Remote MLOps / DevSecOps Engineer – AI Infrastructure Team - 1658 at In All Media Inc

In All Media Inc · Argentina, Argentina · Remote

📌 Job Title: MLOps / DevSecOps Engineer – AI Infrastructure Team

Location: Remote from LATAM | Full-time
Company: Inallmedia.com

🚀 About the Role

We’re looking for a hands-on MLOps / DevSecOps Engineer to own and automate the infrastructure that powers our AI platforms, ensuring robustness, cost-efficiency, and compliance at scale. You’ll work at the intersection of CI/CD, observability, and security, enabling high-throughput pipelines for code, data, model, and prompt deployments in regulated environments.

You’ll help build secure, reproducible, and cost-aware AI infrastructure, with advanced control over scalability, rollback strategies, and model performance monitoring.

🛠️ Key Responsibilities

  • Build reusable Infrastructure as Code (IaC) modules for GPU clusters, distributed storage, and Zero-Trust networks

  • Deploy and operate Kubernetes clusters with GPU-optimized node pools using Cluster Autoscaler or Karpenter

  • Orchestrate multi-stage GitOps pipelines (code, data, model) with Argo CD or Flux

  • Implement advanced rollout/rollback strategies: shadow testing, canary, blue/green

  • Integrate security scanners (OWASP ZAP, Snyk, Veracode, Trivy) into CI/CD, producing actionable reports

  • Set up observability for model drift, budget burn, SLOs, and hallucination metrics

  • Ensure compliance with SOC 2, ISO 27034, and internal audit requirements

  • Reproduce high-performance computing (HPC) environments via Terraform or AWS CDK

🧠 Ideal Candidate

  • Solid background in Cloud DevOps and ML Infrastructure

  • Proven experience with reproducible GPU infrastructure in cloud environments

  • Hands-on expertise in:

    • Infrastructure as Code: Terraform, Pulumi, AWS CDK

    • Kubernetes: EKS, AKS, GKE, Cluster Autoscaler, Karpenter

    • GitOps tools: Argo CD, Flux, GitHub Actions, Azure DevOps, Kustomize

    • Observability: Prometheus, OpenTelemetry, Grafana, Arize AI, FinOps Exporter

    • Security & Compliance: Trivy, Snyk, OWASP ZAP, Veracode, Grype, Kyverno, OPA Gatekeeper

☁️ Infrastructure & Environment

  • Remote-first across LATAM (at least 6 hours of overlap with CST/EST)

  • VPN + SSO access via Okta or Azure Active Directory

  • Cloud IDE access or VS Code Dev Containers

  • GitHub Enterprise, Jira, Slack/Teams

⚙️ Nice to Have

  • Familiarity with tools such as MLflow, LangChain, LangSmith, Ray, DVC, Feast, BentoML

  • Background in budget-aware engineering or FinOps tagging strategies

  • Comfortable with advanced CLI workflows (e.g., tmux, SSH multiplexing)

🔧 Recommended Stack (Expanded)

  • IaC: Terraform, AWS CDK, Pulumi

  • Kubernetes: EKS/AKS/GKE, Cluster Autoscaler, Karpenter

  • CI/CD & GitOps: GitHub Actions, GitLab CI, Argo CD, Flux, Kustomize

  • Monitoring & Logging: Prometheus, Grafana, OpenTelemetry, ELK Stack, Datadog

  • Cloud Providers: AWS, Azure, GCP

  • Security Tools: Trivy, OWASP ZAP, Snyk, Veracode, Grype, Kyverno, OPA Gatekeeper

  • ML Frameworks: PyTorch, TensorFlow

  • Containerization & Automation: Docker, Ansible
