Job Description
We are looking for a skilled AI Engineer with a strong focus on testing, evaluating, and operationalizing Large Language Models (LLMs) to join our growing team. In this role, you will ensure that our language models meet high standards of accuracy, robustness, safety, and performance, and that they integrate seamlessly into our Speech-to-Text and AI-driven application landscape.
You will work closely with product, full-stack, and infrastructure engineers to transform state-of-the-art language models into reliable, production-ready systems that solve real customer problems. In short, you make prototypes production-ready.
Key Responsibilities
LLM Evaluation & Testing
- Design and maintain systematic evaluation frameworks for LLMs, including:
  - Automated test suites
  - Golden datasets
  - Regression benchmarks
- Define quantitative metrics (e.g., accuracy, latency, hallucination rate, task success)...