Senior Engineer for GPU Inference at NVIDIA (Toronto)

NVIDIA

📍 Toronto, ON, Canada

Full-time | Other-General | Posted June 05, 2026

Job Description

Job Overview

Join NVIDIA as a Senior Engineer and build cutting-edge AI inference systems that serve large-scale models efficiently. You will focus on optimizing GPU performance and collaborate with top experts. In this pivotal role, you will architect high-performance inference stacks and optimize NVIDIA's GPU solutions for maximum productivity. Your expertise will be instrumental in achieving industry-leading benchmarks and implementing state-of-the-art GPU kernels within a collaborative, multi-cloud framework. Apply your performance engineering skills at NVIDIA to drive AI innovation.

Key Responsibilities

Develop and optimize features for vLLM with the latest GPU technology
Benchmark and profile GPU kernels for efficiency
Create tools for inference benchmarking methodologies
Lead orchestration of large-scale inference deployments
Publish research to advance ML Systems

Requirements

Extensive background in CS with advanced degree ...