Job Title: HPC Systems Engineer - Enterprise Linux & Cluster Infrastructure
Location: Hybrid / Remote - Swansea (mostly remote; expenses covered for all site travel)
Day Rate: £465 per day - payable to Limited Company / Outside IR35
Duration: 4 months initially
Pay Frequency: Weekly
Start Date: ASAP
Overview
We are seeking an experienced HPC Systems Engineer to deploy, configure, optimise and support high-performance computing (HPC) environments. This includes large-scale Linux clusters, GPU-accelerated systems, and associated storage, networking, and authentication infrastructure. Although this role is hybrid, it is expected to be largely remote (WFH).
Responsibilities
- Deploy, configure, optimise and support enterprise Linux across HPC clusters
- Automate provisioning and manage configuration at scale (PXE, Kickstart, Ansible/Puppet)
- Install, configure, and optimise HPC schedulers (e.g. Slurm) and MPI environments
- Deploy and manage GPU systems (NVIDIA/CUDA) and high-performance storage solutions
- Monitor, benchmark, and tune system performance across compute, network, and storage
- Implement authentication, security controls, and system hardening for multi-user environments
- Support HPC software stacks, toolchains, and container runtimes (Spack, EasyBuild, Apptainer)
- Maintain documentation and support user access/workflows
Requirements
- Strong Linux administration experience in HPC or large-scale environments
- Experience with automation and cluster provisioning
- Knowledge of Slurm, MPI, and parallel computing
- Experience with GPU/CUDA environments
- Understanding of system performance tuning
- Familiarity with identity management and security best practices
- Previous experience operating as an HPC Systems Engineer