
High-Performance Computing Systems Engineer at Nextonic Solutions LLC

Nextonic Solutions LLC · Rockville, United States of America · Onsite


Nextonic Solutions is seeking a High-Performance Computing (HPC) Systems Engineer to join our vibrant team at the National Institutes of Health (NIH), supporting the National Center for Advancing Translational Sciences (NCATS) in Rockville, MD.


The HPC Systems Engineer will support the Scientific Computing and Informatics (SCI) team at NCATS. The role focuses on the design, optimization, security, and maintenance of HPC and cloud-based infrastructure that enables cutting-edge biomedical research through scalable, secure, high-performance computing environments.



Responsibilities:


  • Design, configure, and maintain scalable HPC clusters for optimal performance.
  • Support documentation and ATO (Authority to Operate) processes.
  • Ensure infrastructure design compliance with federal security standards and best practices.
  • Implement monitoring tools such as XDMoD for transparency and user reporting.
  • Integrate platforms such as JupyterHub and job schedulers (e.g., Slurm) for improved interactivity.
  • Develop and manage AWS-based infrastructure using Terraform, Packer, and Ansible.
  • Automate deployment workflows to streamline provisioning, updates, and scaling.
  • Manage systems involved in AWS Secure Cloud Bridging (SCB) and STRIDES initiatives.
  • Implement CIS benchmark-aligned system hardening using OpenSCAP.
  • Administer optimized compute images (CPU/GPU) for scientific workflows.
  • Leverage tools such as OpenHPC, Warewulf, and Ansible for environment management.
  • Lead and coordinate quarterly patch cycles.
  • Partner with researchers and external stakeholders on critical projects.
  • Facilitate solution transitions to other NIH centers and collaborators.
  • Contribute to publications and team objectives through deep technical engagement.



Qualifications:


  • Experience with federal ATO processes (required)
  • Experience with HPC architecture and performance optimization (required)
  • Scientific software development and deployment
  • High-speed network and parallel file system architecture
  • Troubleshooting, diagnostics, and technical support
  • Strong communication and multitasking skills


Programming & Scripting:

  • Languages – Pascal, BASIC, Delphi, Visual Basic, C, C++
  • Scripting – Bash, Perl, Python, Ruby, PEAR, Tcl

Systems & Network Administration:

  • Linux – RHEL/CentOS, SUSE, Debian, Ubuntu
  • Windows – 95–10; NT–Server 2016
  • Networking – Active Directory, TCP/IP v4/v6, DHCP, DNS, WINS
  • Legacy – NOVELL 3.1–5, VPN, Citrix, Terminal Services

Monitoring & Management Tools:

  • Nagios, Ganglia, HP BAC, Precise i3
  • SGI SMC, HP PCM, Bright Cluster Manager (incl. Data Analytics)

Infrastructure & Automation:

  • Puppet, Cobbler, Ansible, Chef
  • Red Hat Satellite, Kickstart, RPM optimization

File Systems & Archiving:

  • Panasas (DirectFlow/panfs), DDN (GPFS), SGI DMF, StorHouse/RFS (Filetek)

HPC Tools & Job Scheduling:

  • MOAB/MAUI, Torque, PBS Pro, Windows HPC Scheduler

Visualization & Remote Access:

  • Nice DCV, EnginFrame, VNC, OpenText Exceed OnDemand, Web Remote Desktop

Containerization & GPU:

  • Docker, Kubernetes, Kubeflow, NVIDIA DGX-1 GPU systems

Databases:

  • SQL Server (2000–2008), MySQL, Zope

High-Speed Networking:

  • Infiniband, Mellanox, OFED, Voltaire, Force10


Proven experience in:

  • HPC architecture and performance tuning
  • Cybersecurity in HPC/cloud environments
  • Infrastructure as Code (AWS, Terraform, Ansible, Packer)
  • Supporting scientific workflows in research environments

