Senior Site Reliability Engineer - Observability (x/f/m) at Doctolib
Doctolib · Paris, France · Remote
Your Impact
What you'll do
- Lead the observability strategy across the platform, with an emphasis on building scalable, developer-friendly logging and tracing capabilities
- Identify and lead large-scale cross-cutting reliability initiatives, including improvements to our incident detection, response, and postmortem analysis capabilities
- Take part in the on-call rotation, and actively contribute to improving our on-call experience by refining alerting, reducing noise, and ensuring actionable telemetry
Who you are
- Have solid hands-on experience (3+ years) on a large-scale production platform
- Have proven experience with cloud platforms such as AWS, Azure or Google Cloud
- Have a solid understanding of containerization and orchestration technologies (Docker and Kubernetes)
- Have a strong understanding of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows
- Have deep expertise in observability tooling and architecture, such as:
  - Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector
  - Tracing: OpenTelemetry or proprietary APMs
  - Metrics: Prometheus, Thanos, Datadog, or equivalent
- Have proficiency in at least one programming language (Ruby, Python, Go, Java, etc.) and a deep understanding of infrastructure as code principles
- Have experience with monitoring and observability tools
- Like troubleshooting performance issues in complex environments
- Are fluent in English
- Have experience contributing to open-source observability projects
- Have worked in a high-growth tech environment
- Are passionate about developer experience and platform engineering
Life at Doctolib Tech
- Our solutions are built on a single fully cloud-native platform that supports web and mobile app interfaces, multiple languages, and is adapted to country and healthcare specialty requirements.
- Our stack is composed of Rails, TypeScript, Java, Python, Kotlin, Swift, and React Native.
- We leverage AI ethically across our products to empower patients and health professionals.
What we offer
- Free comprehensive health insurance (basic package) for you and your children
- 25 days of paid vacation per year, plus up to 14 days of RTT
- Free mental health and coaching services through our partner Moka.care
- Work from abroad for up to 10 days per year thanks to our flexibility days policy
- Lunch vouchers (Swile card) worth €8.50 per working day, with €4.50 covered by Doctolib
- A subsidy from the works council that covers part of the cost of a sports club membership or a creative class
- 50% reimbursement of your public transport subscription
- Parent Care Program: receive one additional month of leave on top of the legal parental leave
- Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
- Relocation support in case of international mobility
- Access to the best AI tools for coding and development, plus dedicated training
Our interview process
- Recruiter Interview
- Technical SRE Interview
- System Design Interview
- Behavioral Interview
- At least one reference check
Job details
- Permanent position
- Tech stack: Kubernetes, Prometheus, OpenTelemetry, Loki, ArgoCD, Ruby, Python, Go
- Full-time
- Paris, France
- Hybrid work setup (up to 2 remote days per week)
- Start date: as soon as possible