Principal Engineer, AI Inference Reliability

Cerebras

📍 Canada

Full-time Other-General Posted March 03, 2026

Job Description

About the team

The Cerebras Inference team’s mission is to deliver the world’s most performant, secure, and reliable enterprise‑grade AI service. We build and operate large‑scale distributed systems that power AI inference at unprecedented speed and efficiency. Join us to help scale inference and accelerate AI.

About the role

We’re looking for a hands‑on Reliability Tech Lead (IC) to own the mission of making Cerebras Inference the most reliable AI service in the world. You will drive reliability strategy and execution across our inference stack, from client SDKs and public‑cloud multi‑region deployments to wafer‑scale systems in specialized data centers.

In this role, you will define SLOs and incident‑response frameworks, design and implement reliability mechanisms at scale, and partner with hundreds of engineers to ensure our service meets world‑class reliability standards.

If you are passionate about building and operating massive‑scale...