OpenAI adds new voice models to its API

5/7/2026

OpenAI audio models bring GPT-Realtime-2, Translate, and Whisper to voice apps

Build richer calls, live translation, and fast transcription with developer-ready pricing

OpenAI has introduced three new audio models in its API: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. The models are designed to help developers build voice apps that can reason more effectively, translate speech in real time, and transcribe audio as people speak. GPT-Realtime-2 is the company’s first voice model with GPT-5-class reasoning, aimed at handling more complex requests and maintaining natural conversation flow. GPT-Realtime-Translate supports live speech translation from more than 70 input languages into 13 output languages, while GPT-Realtime-Whisper provides low-latency streaming speech-to-text. OpenAI said the tools are intended for applications such as customer support, travel, education, events, and business workflows. The company also highlighted safety controls in the Realtime API and said the new models are available now with published pricing for developers.