Requesty
Requesty is an LLM routing and gateway platform that exposes a single OpenAI-compatible API across 300+ models from providers like OpenAI, Anthropic, DeepSeek, and Together AI. The Requesty Router adds intelligent routing, automatic fallbacks, response caching, spend controls, and per-request cost observability on top of unified inference.
APIs
Requesty Chat Completions API
OpenAI-compatible chat completions routed across 300+ models from OpenAI, Anthropic, DeepSeek, Together AI, and more, with streaming, tool use, web search, automatic fallbacks, ...
Requesty Models API
Lists the 300+ models routable through the Requesty gateway with their identifiers, provider, context length, and per-token pricing.
Requesty Usage & Analytics API
Retrieves per-key and organization-level usage statistics, request cost, and spend reporting for observability and FinOps across the gateway.
Requesty API Keys API
Programmatically create, list, inspect, and delete API keys and manage their spending limits, labels, and expiration for governing gateway access.
Event Specifications
Requesty Chat Completions Streaming (HTTP + SSE)
AsyncAPI 2.6 description of Requesty's **chat completion streaming** surface. Requesty does not publish a WebSocket API. The only asynchronous / event-style transport documented...
ASYNCAPI