OpenAI Realtime API

The Realtime API enables low-latency, bidirectional communication with models that natively support speech-to-speech interaction. These models accept multimodal input (audio, images, and text) and produce audio and text output. Clients can connect over WebRTC, WebSocket, or SIP to build real-time voice agents and conversational interfaces.
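As a rough illustration of the WebSocket connection method, the sketch below only prepares what a client would need before opening a socket: the URL, the handshake headers, and one JSON client event. It does not open a connection. The endpoint URL, the model name `gpt-realtime`, and the `response.create` event type are assumptions drawn from the public API and should be checked against the current reference; the helper names (`build_ws_url`, `build_headers`, `make_event`) are hypothetical and exist only for this example.

```python
import json
import os
from urllib.parse import urlencode

# Assumed values; confirm both against the current API reference.
REALTIME_WS_BASE = "wss://api.openai.com/v1/realtime"
DEFAULT_MODEL = "gpt-realtime"


def build_ws_url(model: str = DEFAULT_MODEL) -> str:
    """Return the WebSocket URL with the model passed as a query parameter."""
    return f"{REALTIME_WS_BASE}?{urlencode({'model': model})}"


def build_headers(api_key: str) -> dict[str, str]:
    """Return the HTTP headers sent with the WebSocket handshake."""
    return {"Authorization": f"Bearer {api_key}"}


def make_event(event_type: str, **fields) -> str:
    """Serialize a client event as a JSON text frame."""
    return json.dumps({"type": event_type, **fields})


if __name__ == "__main__":
    # The placeholder key keeps the sketch runnable without credentials.
    api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
    print(build_ws_url())
    print(sorted(build_headers(api_key)))
    # Assumed event type that asks the model to produce a response.
    print(make_event("response.create"))
```

A WebSocket client library would take the URL and headers from these helpers and send the serialized events as text frames. WebRTC and SIP connections use different setup steps that this sketch does not cover.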