AHEAD builds platforms for digital business. By weaving together advances in cloud infrastructure, automation and analytics, and software delivery, we help enterprises deliver on the promise of digital transformation.
At AHEAD, we prioritize creating a culture of belonging, where all perspectives and voices are represented, valued, respected, and heard. We create spaces to empower everyone to speak up, make change, and drive the culture at AHEAD.
We are an equal opportunity employer and do not discriminate based on an individual's race, national origin, color, gender, gender identity, gender expression, sexual orientation, religion, age, disability, marital status, or any other protected characteristic under applicable law, whether actual or perceived.
We embrace all candidates who will contribute to the diversification and enrichment of ideas and perspectives at AHEAD.
We are seeking an experienced AI Platform Engineer to design, deploy, and optimize AI/ML infrastructure, AI workflows, and automated pipelines. This role focuses on building scalable environments for training and deploying machine learning models, leveraging modern orchestration, automation, and GPU acceleration technologies. You will collaborate with data scientists and platform engineers to drive efficient resource utilization and scalable operations across cloud and hybrid environments.
Key Responsibilities
Kubernetes for AI/ML: Architect and manage Kubernetes clusters tailored to AI/ML workloads.
GPU Orchestration: Implement Run:ai and operators for GPU resource orchestration and workload scheduling.
Automation & Pipelines: Develop and maintain Python-based automation scripts and ML pipelines; automate infrastructure provisioning with Terraform and configuration management with Ansible.
Notebooks & Collaboration: Create and manage Jupyter Notebooks for experimentation and collaboration.
NVIDIA Integration: Integrate and optimize NVIDIA Enterprise Suite components (CUDA, NeMo Framework, Triton, TensorRT, GPU drivers) for accelerated computing.
MLOps Practices: Establish and maintain MLOps best practices for model lifecycle management, CI/CD, and monitoring (e.g., MLflow, Kubeflow).
Collaboration: Work closely with data scientists and platform engineers to ensure efficient resource utilization and scalability across environments.
Required Skills & Experience
Strong proficiency in Python and experience with ML frameworks (TensorFlow, PyTorch).
Hands-on experience with Kubernetes and container orchestration.
Familiarity with Run:ai or similar GPU scheduling platforms.
Expertise in Terraform and Ansible for infrastructure automation.
Experience with Jupyter Notebooks for ML development.
Knowledge of NVIDIA Enterprise Suite (CUDA, NeMo Framework, Triton, GPU drivers).
Solid understanding of MLOps principles and tools (e.g., MLflow, Kubeflow).
Background in deploying and scaling AI workloads in cloud or hybrid environments.
Qualifications
4+ years in platform architecture or solutions architecture, with 2+ years focused on AI/ML workloads.
Experience with high-performance computing (HPC) environments.
Familiarity with distributed training and model optimization techniques.
Certification in Kubernetes or cloud platforms (AWS, Azure, GCP).
Additional Information
Why AHEAD:
Through our daily work and internal groups like Moving Women AHEAD and RISE AHEAD, we value and benefit from diversity of people, ideas, experience, and everything in between.
We fuel growth by stacking our office with top-notch technologies in a multi-million-dollar lab, encouraging cross-department training and development, and sponsoring certifications and credentials for continued learning.