Prime Intellect Inference API
OpenAI-compatible inference API for hosted frontier and open models served at api.pinference.ai. Supports streaming chat completions, the full set of OpenAI parameters (temperature, top_p, max_tokens, logprobs), and returns a `usage` object with input/output token counts and USD cost on every response. LoRA adapters can be served alongside base models via 1-click deployments.