Senior Runtime Engineer

Cerebras

📍 Toronto, ON, Canada

Full-time | Other-General | Posted February 19, 2026

Job Description

Overview

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This enables industry-leading training and inference speeds and allows machine learning users to run large-scale ML applications without managing hundreds of GPUs or TPUs.

Cerebras' current customers include top model labs, global enterprises, and AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy high-speed inference at scale.

Thanks to the wafer-scale architecture, Cerebras Inference runs generative AI inference significantly faster than GPU-based hyperscale cloud inference services. This speed improves the user experience of AI applications and enables real-time iteration and deeper model intelligence.

About The Role

We are building the next generation ...