ControlUp creates an autonomous workplace where the day runs itself.
We’re a leader in DEX, unifying digital employee experience and IT operations into one powerful platform built for modern workplace management. By combining real-time monitoring, automation, and proactive remediation, ControlUp enables IT teams to prevent issues before they impact employees, reduce operational complexity, and streamline IT environments, without the clutter of multiple tools. With ControlUp, IT works smarter, employees stay productive, and the workday runs itself. One platform. One powerful shift in how work flows.
No tool sprawl. No wasted time. No interruptions. Just technology that runs smoothly, so people can get on with work that matters.
The Role
We are seeking a highly skilled Site Reliability Engineer (SRE) to own production stability, system performance, financial operations (FinOps), and cost of goods sold (COGS) management in a large-scale environment. You will work closely with engineering, product, and customer teams to ensure our advanced technology stack is optimized to meet and exceed customer SLAs.
How You’ll Spend Your Day
Maintain and improve production stability across a large-scale infrastructure with thousands of Kubernetes nodes and instances
Monitor, analyze, and optimize system performance to ensure seamless user experience and SLA adherence
Implement and drive FinOps practices to manage cloud cost efficiency and cost of goods sold (COGS) effectively
Utilize ControlUp and other advanced monitoring/observability tools to proactively detect issues and ensure SLA compliance
Collaborate with development and operations teams to automate deployments, scaling, and incident response
Design and implement robust alerting, incident management, and post-mortem processes
Continuously evaluate and adopt cutting-edge technologies to improve reliability, performance, and cost efficiency
Provide technical guidance and best practices for infrastructure and application scalability
Participate in on-call rotations to respond to critical incidents and minimize downtime
Your Experience and Qualifications
Proven experience as an SRE or in a similar role in large-scale environments with thousands of Kubernetes nodes and instances
Strong expertise in Kubernetes, container orchestration, and cloud infrastructure (AWS, GCP, Azure, or similar)
Solid understanding of performance tuning, monitoring, and observability tools (experience with ControlUp is a strong plus)
Experience with FinOps principles and tools to manage cloud costs and optimize resource utilization
Deep knowledge of production incident management, root cause analysis, and SLA management
Proficiency in scripting and automation (Python, Go, Bash, etc.)
Familiarity with CI/CD pipelines and infrastructure as code (Terraform, Helm, etc.)
Excellent communication skills and ability to work collaboratively across teams