Hybrid Lead Site Reliability Engineer (f/m/x) - API Platforms

Deutsche Bank · Berlin Otto-Suhr-Allee 6-16 · Germany · Hybrid

Job Description:

Deutsche Bank Technology in Berlin

DB Technology is a global team of technology specialists, spread across multiple trading hubs and tech centres. We have a strong focus on promoting technical excellence – our engineers work at the forefront of financial services innovation using cutting-edge technologies. 

Our Berlin location is the most recent addition to our global network of tech centres and is growing strongly. We are committed to building a diverse workforce and to creating excellent opportunities for talented engineers and technologists. Our tech teams and business units use agile ways of working to create #GlobalHausbank solutions from our home market.

API Platforms and Integration Services

Deutsche Bank's API Platforms and Integration Services team orchestrates internal and external API platforms, portals, enabling services and embedded finance products at a global level. The team is a highly skilled and innovative group dedicated to developing cutting-edge solutions and services that leverage the power of APIs to drive digital transformation and enhance the banking experience for clients worldwide.

As a Lead Site Reliability Engineer, you will be responsible for the SRE activities across platforms, portals and enabling services together with other SREs and engineers.

-> You love this job but feel you cannot tick 100% of the boxes? Send us your CV anyway!

Your key responsibilities

  • As Lead Site Reliability Engineer you

    • Orchestrate and contribute to SRE activities across API Platforms and Integration Services

    • Introduce all engineering disciplines that combine software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems

    • Implement the core of DevOps with specific principles and practices, focusing on “what” and “how” to improve reliability

    • Establish and support capacity planning procedures and keep a close eye on SLIs and SLOs for production readiness and in the live environment

    • Coordinate with the rest of the division and the teams working on different layers of the application and infrastructure, with full commitment to collaborative problem solving

  • For Infrastructure & Service Management you

    • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement

    • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health

    • Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity

    • Develop and enforce policies, standards and guidelines for site reliability

    • Automate application and infrastructure deployment activities to production environments

  • For Incident & Problem Management you

    • Perform troubleshooting & Emergency Response

    • Investigate root causes and suggest solutions

    • Increase productivity by leading blameless post-mortems

  • For Application Maintenance you

    • Collaboratively work with Product Owners and Engineers to run reliable services

    • Configure and maintain applications and monitoring

    • Identify business objects for monitoring

    • Track system performance and capacity, and use your experience to create effective strategies for maintaining and improving system performance and availability

  • For Operational Continuous Improvement you

    • Identify issues and optimization potential and introduce related user stories

    • Support with automation know-how to reduce the risk of bad changes

    • Identify, design, develop, and deploy tools and processes to monitor, maintain, and report site performance and availability

  • For Service Onboarding you

    • Support your Squad and your Chapter population in onboarding & promotions

Your skills and experiences

  • Hands-on experience with cloud ecosystems run on Google Cloud

  • Hands-on experience with Docker / Kubernetes operations with GKE or similar technology

  • Expert experience with automated infrastructure provisioning based on Terraform/Terragrunt, Terraform Enterprise, Ansible

  • Advanced hands-on experience with Continuous Integration / Continuous Deployment (GitHub) and patterns for CI/CD pipelines

  • Advanced hands-on experience with monitoring tools like Prometheus, Grafana, Kibana and alerting tools like Opsgenie, New Relic, Datadog, Splunk, Google Cloud Operations Suite (Stackdriver)

  • Very good knowledge of security capabilities (TLS, OAuth2, KMS, Vault, Admission Controllers, Let's Encrypt or similar technologies)

  • Very good understanding of Microservice architectures and experience with API Management with Apigee or WSO2

  • Experience in software development in at least one language (Java, JavaScript, Python, Go)

  • Good knowledge of the Software Development Life Cycle processes based on related tools such as

    • TeamCity, Bitbucket, Artifactory

    • SonarQube, Veracode, Crucible

    • Jira, Confluence, ServiceNow

What we offer

We provide you with a comprehensive portfolio of benefits and offerings to support both your private and professional needs.

  • Emotionally and mentally balanced
    A positive mind helps us master the challenges of everyday life – both professionally and privately. We offer consultation in difficult life situations as well as mental health awareness trainings.

  • Physically thriving
    We support you in staying physically fit through offerings that help you maintain personal health and a professional environment. You can benefit from health check-ups and vaccination drives as well as advice on healthy living and nutrition.

  • Socially connected
    Networking opens up new perspectives, helps us thrive professionally and personally, and strengthens our self-confidence and well-being. You can benefit from PME family service, FitnessCenter Job, flexible working (e.g. part-time, hybrid working, job tandem) as well as an extensive culture of diversity, equity and inclusion.

  • Financially secure
    We provide you with financial security not only during your active career but also for the future. You can benefit from offerings such as pension plans, banking services, company bicycle or “Deutschlandticket”.


Since our offerings vary slightly across locations, please contact your recruiter with specific questions.

This job is available full-time and part-time.


If you have any recruitment-related questions, please get in touch with Kilian Weber.

Contact Kilian Weber: +49 30 34073087

We strive for a culture in which we are empowered to excel together every day. This includes acting responsibly, thinking commercially, taking initiative and working collaboratively.

Together we share and celebrate the successes of our people. Together we are Deutsche Bank Group.

We welcome applications from all people and promote a positive, fair and inclusive work environment.