Parasail Inference API
OpenAI-compatible real-time and streaming inference API exposing serverless access to popular open-weight LLMs, embedding models, and the model catalog. Endpoints: /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/models. Bearer-token authentication; pay-per-token billing; supports streaming, tool use, and structured outputs. Compatible with the OpenAI Python and TypeScript clients by overriding base_url.