- Office in Paris
What we do
Doctolib’s Engineering environment is rich, and we build innovative products and features that aim each day to make doctors' and patients' lives easier. We are looking for a Site Reliability Engineer II to keep Doctolib production systems running smoothly. You will also be a key player in supporting the exponential growth of Doctolib services.
What you will do
As a Site Reliability Engineer II at Doctolib, you will play a critical role in fostering a platform-oriented approach to reliability and performance, empowering teams to embrace the “You build it, you run it” culture.
Your role:
- Platform Reliability: Design, build, and maintain the core platform infrastructure to enable scalability and resilience, ensuring the platform can support hundreds of thousands of concurrent users
- Automation and Efficiency: Develop tools and processes to automate the deployment, scaling, and lifecycle management of services, reducing toil and increasing reliability
- Monitoring and Incident Management: Implement robust monitoring, alerting, and incident response mechanisms to detect and resolve issues before they impact practitioners or patients
- Disaster Recovery: Design and execute disaster recovery strategies to ensure business continuity in critical scenarios
- Collaborate with Feature Teams: Partner with product and engineering teams to embed reliability best practices, enhance performance, and instill operational excellence into their workflows
- Continuous Improvement: Research and evaluate emerging technologies and tools to continuously enhance platform reliability and operational practices
- On-Call Ownership: Participate in an on-call rotation to maintain a proactive, efficient response to incidents, reinforcing the “You build it, you run it” philosophy
Who you are
You could be our next teammate if you:
- Have solid hands-on experience (3+ years) on a large-scale production platform
- Have proven experience with cloud platforms such as AWS, Azure or Google Cloud
- Have a solid understanding of containerization and orchestration technologies (Docker and Kubernetes)
- Have a strong understanding of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows
- Have proficiency in at least one programming language (Ruby, Python, Go, Java, etc.) and a deep understanding of infrastructure as code principles
- Have experience with monitoring and observability tools
- Like troubleshooting performance issues in complex environments
- Speak English
What we offer
- Free Health Insurance for you & your family
- Up to 14 days of RTT
- Parental care program (1 month off in addition to the legal parental leave, and 0.5 days off per child when school starts)
- Wellbeing program (free mental health and coaching offer with our partner moka.care)
- A flexible workplace policy offering both hybrid and office-based modes
- Flexibility days allowing you to work from EU countries and the UK 10 days per year
- Lunch vouchers with a Swile card
- Works Council subsidy to refund part of a sports club membership or creative class
- Bicycle subsidy
The interview process
- Recruiter interview
- Technical SRE interview
- System Design interview
- Behavioral interview
- Background / Reference check
- Offer!