Hybrid Production Engineer at Systematica Investments
Systematica Investments · London, United Kingdom · Hybrid
- Senior
- Optional office in London
Key Responsibilities
Platform Engineering & Automation (Core SRE Focus – ~50%)
- Build and maintain automated tools for deployment, health checks, alerts, and runbooks.
- Lead efforts in observability, including metrics instrumentation, logging, and dashboards.
- Develop self-healing mechanisms for recurring production issues.
- Continuously reduce manual operational work ("toil") through scripting.
Reliability Engineering & Incident Management (~30%)
- Monitor the health of trading systems, with the goal of preventing failures proactively.
- Own and improve incident response, root cause analysis, and blameless post-mortems.
- Design and validate failover and disaster recovery strategies.
- Collaborate with developers to design robust, testable deployment pipelines.
- Support trading operations during market hours, with occasional late-shift coverage (until 11pm).
- Interface with internal users (trading, ops, quant teams) and external vendors for production-level concerns.
- Help guide releases during system maintenance windows with safe deployment practices.
- Maintain and expand documentation for system behavior, runbooks, and escalation flows.
- Languages: Python (primary), Bash, T-SQL
- OS/Infrastructure: Linux, Windows, Docker, AWS Cloud services
- Monitoring & Alerting: DataDog, Grafana, custom tooling
- Automation/CI/CD: Git, TeamCity, Ansible, Terraform (optional)
- Databases: MS SQL Server, Snowflake
- Any other duties commensurate with the post holder’s position and seniority.
- All employees should understand that it is their personal responsibility to comply with all organisational, statutory and regulatory policies and procedures.
Skills, Knowledge and Expertise
- 5+ years in a production-facing engineering role within finance or other mission-critical tech domains.
- Proven experience with automation, observability, and incident response in distributed systems.
- Comfort with scripting and systems programming (Python, Bash).
- Experience with config management and container orchestration tools.
- Strong communication and debugging skills, especially under pressure.
- Analytical mindset and curiosity-driven troubleshooting.
- Calm, decisive demeanor during critical incidents.
- Empathy for both internal users and downstream systems.
- Bias toward eliminating root causes rather than treating symptoms.
- Educated to degree (or equivalent) level or higher, preferably from a leading university.
- Bachelor’s or Master’s in Computer Science, Engineering, or equivalent field.