Predibase
Predibase is a platform for fine-tuning and serving open-source LLMs. It pairs efficient LoRA / Turbo LoRA supervised and reinforcement (GRPO) fine-tuning with serverless and dedicated inference powered by LoRAX, the open-source multi-LoRA serving stack that packs hundreds of adapters onto a single GPU. Inference is exposed through an OpenAI-compatible API plus native generate endpoints.
APIs
Predibase Inference (OpenAI-Compatible) API
OpenAI-compatible chat completions and completions served from Predibase serverless and dedicated deployments, with per-request LoRA adapter selection via the model field and SS...
Predibase Prompt / Generate API
Native text-generation endpoints (generate and generate_stream) for prompting deployed base models and fine-tuned adapters, with adapter source selection (pbase, hub, or s3) and...
Predibase Fine-Tuning API
Create and manage supervised and reinforcement (GRPO) fine-tuning jobs that train efficient LoRA / Turbo LoRA adapters on top of open-source base models, returning adapter versi...
Predibase Adapters API
Manage adapter repositories and the trained adapter versions inside them - the LoRA artifacts produced by fine-tuning jobs that are loaded onto deployments for inference.
Predibase Deployments API
Create, read, update, and delete dedicated and private serverless deployments, selecting a base model and GPU accelerator (A10, A100) and enabling LoRA serving for fine-tuned ad...
Predibase Datasets API
Connect and manage datasets used as input to fine-tuning jobs, uploaded from files or referenced from connected storage.
Predibase Models API
List the open-source base models supported on Predibase for fine-tuning and serving, with metadata used when creating jobs and deployments.
Predibase Batch Inference API
Launch asynchronous batch inference jobs against a base model with per-row adapter selection, billed at a flat per-million-token batch rate for non-realtime workloads.
Event Specifications
Predibase Inference Streaming (HTTP + SSE)
AsyncAPI 2.6 description of Predibase's **inference streaming** surface. Predibase does not publish a WebSocket API. The only asynchronous / event-style transport documented at ...
ASYNCAPI