About the Role 
We're building an ambitious internal AI Platform to power Bright's next generation of AI-driven products and services. This Kubernetes-hosted platform provides teams across the organisation with the tools to build, deploy, and observe AI-powered applications without managing complex infrastructure themselves. 

As an AI Platform Engineer, you'll join a small, high-impact team building critical platform infrastructure for LLM operations (LLMOps). Working under the supervision of two senior/principal platform engineers and reporting to the Head of AI, you'll be instrumental in delivering self-service AI capabilities that enable developers across Bright to build sophisticated AI applications with confidence. 
This is an opportunity to work on cutting-edge AI infrastructure, learn from experienced platform engineers, and make a significant impact on how Bright leverages AI technology at scale. 



Key Responsibilities

Our roadmap spans multiple interconnected platform epics. You'll contribute to key initiatives including: 
Core Platform Services 
  • Observability & Experimentation: Enhancing Langfuse for LLM tracing, evaluation, and experimentation capabilities 
  • Developer Self-Service: Building and improving Backstage as an internal developer portal for platform discoverability 
  • LLM Operations: Deploying and maintaining LiteLLM proxy, Langflow runtime, and other core LLM services 
  • Monitoring & Logging: Implementing platform-wide monitoring (Prometheus/Grafana) and logging infrastructure (Loki) 
Security & Compliance 
  • LLM Ops Security: Implementing guardrails (LlamaGuard, Azure Guardrails) and security controls 
  • GDPR & PII Management: Building automated PII detection, minimization strategies, and compliance tooling 
  • Incident Response: Establishing security incident response procedures for LLM operations 
Infrastructure & Reliability 
  • Kubernetes Operations: Managing AKS clusters, implementing reliable deployment tooling via ArgoCD 
  • Infrastructure as Code: Productionizing infrastructure with Terraform, eliminating manual configuration 
  • Autoscaling & Performance: Implementing workload management and autoscaling for AI services 
  • Storage Solutions: Migrating from self-hosted MinIO to managed Azure Blob Storage 
Applications Support 
You'll also support the deployment and operation of AI applications built on the platform, including: 
  • RAG (Retrieval-Augmented Generation) applications like Ask IPASS and Ask UK Pay Centre 
  • Document processing applications (BrightCapture) 
  • Employee onboarding automation (Oscar) 
  • Internal AI assistant (Bright GPT) 

Skills, Knowledge and Expertise

What We're Looking For 
Essential Skills & Experience 
  • Platform Engineering Fundamentals: 2-4 years of experience with cloud infrastructure, preferably Azure 
  • Kubernetes: Practical experience deploying and managing applications in Kubernetes (AKS experience is a plus) 
  • Infrastructure as Code: Hands-on experience with Terraform or similar IaC tools 
  • CI/CD: Experience with GitOps workflows and tools like ArgoCD, GitHub Actions, or similar 
  • System Programming: Proficiency in Python or Go for automation and tooling; shell scripting essential 
  • Linux & Containers: Solid understanding of containerization with Docker and container orchestration 
Desirable Experience 
  • Exposure to LLM technologies or AI/ML infrastructure 
  • Experience with observability tools (Prometheus, Grafana, Loki) 
  • Knowledge of Helm and Helmfile for Kubernetes deployments 
  • Knowledge of Kustomize 
  • Understanding of security best practices and compliance requirements (GDPR) 
  • Backend-as-a-Service platforms (Supabase or similar) 
  • Developer portal platforms (Backstage or similar) 
  • Application programming experience with .NET and/or TypeScript 
What Makes You a Great Fit 
  • Learning Mindset: You're excited to learn about LLM operations and emerging AI infrastructure patterns 
  • Systems Thinking: You understand how distributed systems work and can reason about failure modes 
  • Pragmatic Approach: You balance the pursuit of ideal solutions against the need to ship value quickly 
  • Collaboration: You work well with both technical and product stakeholders 
  • Documentation: You believe good documentation is as important as good code 
  • Ownership: You take responsibility for your work from development through to production 
Team Structure & Reporting 
  • Reports to: Head of AI 
  • Works closely with: Two senior/principal platform engineers 
  • Collaborates with: Application development teams, product managers, and security/compliance stakeholders 
  • Team size: Small, full-stack AI team covering development, DevOps, operations, and support 
What Success Looks Like 
In your first 3 months: 
  • You've contributed to multiple platform epics from our roadmap 
  • You understand the architecture of our AI platform and can navigate the codebase 
  • You've successfully deployed services to our Kubernetes clusters 
  • You're participating in the on-call rotation and can troubleshoot platform issues 
In your first 6 months: 
  • You're independently owning epics and driving them to completion 
  • You're contributing to architectural decisions and technical direction 
  • You've improved platform reliability, observability, or developer experience 
  • You're mentoring junior engineers or helping onboard new team members 
Technical Stack 
Infrastructure: Azure (AKS, Blob Storage, Cognitive Services), Kubernetes, Terraform
Platform Services: LiteLLM, Langflow, Langfuse, Supabase, Open WebUI, Backstage
Observability: Prometheus, Grafana, Loki, Langfuse tracing
CI/CD: ArgoCD, GitHub Actions, Helmfile
Languages: Python, Go, Shell scripting
Security: Azure Guardrails, LlamaGuard, PII detection tooling 
Why Join Bright's AI Platform Team? 
  • Impact: Your work directly enables AI innovation across the entire organisation 
  • Growth: Learn from experienced platform engineers in a supportive environment 
  • Cutting Edge: Work with the latest AI infrastructure and tooling 
  • Autonomy: Small team means you'll have significant ownership and influence 
  • Mission: Help accountants and finance professionals work more efficiently with AI 
 

Benefits

What will you get?  
  • Competitive salary  
  • Performance-based bonus 
  • 25 days annual leave  
  • Health Insurance  
  • Company pension  
  • Company events  
  • Free food on-site  
  • On-site parking  
  • Referral programme  
  • Sick pay  
  • Wellness programmes 