Product Manager (Infrastructure) at FluidStack
FluidStack · San Francisco, United States of America · Onsite
- Professional
- Office in San Francisco
About FluidStack
Fluidstack is the AI Cloud Platform. We build GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more.
Our team is small, highly motivated, and focused on providing a world-class supercomputing experience. We put our customers first in everything we do, working hard not just to win the sale, but to win repeat business and customer referrals.
We hold ourselves and each other to high standards. We expect you to care deeply about the work you do, the products you build, and the experience our customers have in every interaction with us.
You must work hard, take ownership from inception to delivery, and approach every problem with an open mind and a positive attitude. We value effectiveness, competence, and a growth mindset.
About the Role
We're looking for a Product Manager to define our infrastructure platform, including the end-to-end hardware lifecycle, DCIM, and building controls.
You will work directly with our engineering and infrastructure teams and collaborate closely with our datacenter design and operations teams to ensure that we’re building the tools that enable us to deliver hundreds of thousands of GPUs of capacity and operate that capacity reliably over a multi-year time horizon.
Focus
- Own the processes and tools to manage the lifecycle of all hardware within a datacenter environment, from shipment through to retirement. 
- Partner with infrastructure and engineering to translate datacenter design and operational requirements into technical specifications, including the DCIM system, as well as network automation and ZTP of compute infrastructure. 
- Work with external datacenter and operations partners to integrate their systems into our tooling. 
- Collaborate with hardware vendors on naming schemes, asset management, factory integration, RMA process, and other stages of the hardware lifecycle. 
About You
- 3-5 years of experience building developer tools or cloud infrastructure, ideally including building DCIM tools or managing the lifecycle of compute and networking infrastructure.
- Strong understanding of AI/ML workloads and infrastructure, including GPU acceleration, model training and inference pipelines, and modern datacenter architecture. 
- Familiarity with DCIM tools like NetBox as well as bare metal provisioning and management tools (e.g. MAAS, Tinkerbell, Metal3).
- Familiarity with industrial protocols like Modbus for telemetry/management of CDUs/UPSes/CRACs/ATSes/etc. 
- Knowledge of Infrastructure-as-Code (IaC) tools (e.g. Terraform, Pulumi) to standardize and simplify the deployment of our infrastructure stack.
- Understanding of SLA and SLO frameworks and error budget management, as well as the ability to build new Grafana dashboards to track metrics that matter.
- Excellent communication and cross-functional leadership skills, with the ability to collaborate effectively across engineering and operational stakeholders.
- Comfortable designing and working with APIs. 
- Strong product intuition and taste in developer experience and tooling. 
Benefits
- Competitive total compensation package (salary + equity). 
- Retirement or pension plan, in line with local norms. 
- Health, dental, and vision insurance. 
- Generous PTO policy, in line with local norms. 