JL4 Site Reliability Engineer (SRE)

Maersk Role Overview: We are part of the Technology Operations Platform, and our vision is to improve the experience of business users and the engineering community by making technology operations simpler and more efficient. Our ultimate goal is to develop a comprehensive suite of tools and solutions that help Site Reliability Engineering (SRE) teams get started quickly and manage performance and reliability effectively across the organization.

As a JL4 Site Reliability Engineer at Maersk, you will play a critical role in ensuring the reliability, scalability, and performance of our global systems. You will work closely with development and operations teams to automate processes, build resilient infrastructure, and drive continuous improvement. This role demands strong expertise in Golang, Python, Ansible, and SRE principles, with a focus on automation and observability. AI/ML knowledge is a valuable plus.

Key Focus Areas:
- Status Page: Emphasis on maintaining a status page for transparency and incident communication.
- Zero-Touch Automation: Highlighting strategies to eliminate manual interventions.
- Middleware & Microservices: Demonstrating architectural know-how for robust service interactions.
- Observability: Utilize observability tools and practices to build scalable SRE and automation solutions for global platforms and users. Collaborate effectively with the Observability team to enhance system insights without overlapping responsibilities.
- Operational Excellence: Applying principles to improve reliability and team efficiency.

Key Responsibilities:
- Design, implement, and maintain scalable and reliable infrastructure using Golang, Python, and Ansible.
- Develop automations to eliminate manual, redundant toil.
- Collaborate with cross-functional teams to define SLIs, SLOs, and error budgets.
- Monitor system performance and availability using tools like Prometheus and Grafana.
- Conduct root cause analysis and postmortems for incidents.
- Drive adoption of SRE best practices across engineering teams.
- Participate in on-call rotations and proactively prevent incidents.
- Support AI/ML workloads and infrastructure where applicable.
- Demonstrate strong expertise in SRE and automation technologies.
- Act as a problem solver and critical thinker in complex technical scenarios.

Required Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in SRE, DevOps, or backend engineering roles.
- Strong programming skills in Golang, Python, and Ansible.
- Hands-on experience with at least one cloud platform (AWS, GCP, or Azure).
- Proficiency in container orchestration (Kubernetes, Docker).
- Deep understanding of distributed systems and reliability engineering.
- Experience with monitoring, logging, and alerting systems.
- Excellent problem-solving and communication skills.

Maersk is committed to a diverse and inclusive workplace, and we embrace different styles of thinking. Maersk is an equal opportunities employer and welcomes applicants without regard to race, colour, gender, sex, age, religion, creed, national origin, ancestry, citizenship, marital status, sexual orientation, physical or mental disability, medical condition, pregnancy or parental leave, veteran status, gender identity, genetic information, or any other characteristic protected by applicable law. We will consider qualified applicants with criminal histories in a manner consistent with all legal requirements.

 

We are happy to support your need for any adjustments during the application and hiring process. If you need special assistance or an accommodation to use our website, apply for a position, or to perform a job, please contact us by emailing [email protected]
