Technical Lead, Multimodal Infrastructure

OpenAI

📍 san francisco, ca, United-States

Full-time Other-General Posted June 14, 2026

Job Description

About the Team

The Multimodal Research team at OpenAI is building the next generation of AI systems that can understand and generate content across multiple modalities—including text, audio, images, and video. The team’s mission is to unlock new capabilities by enabling models to process and reason about diverse data types simultaneously.

This team sits at the intersection of cutting-edge research and production, and plays a central role in shaping OpenAI's multimodal offerings—from speech-to-speech agents to video understanding and image generation. Recent projects include real-time voice agents, modular speech components, and fine-tuning pipelines that support product launches like GPT-4o

About the Role

As a TL (Tech Lead) for the multimodal infrastructure team, you will help provide technical leadership to a high-caliber team of ML infrastructure engineers supporting every multimodal research initi...