Predibase · Pricing Plans

Name: Predibase Plans Pricing
Creator: Predibase
Keywords: AI, LLM, Fine-Tuning, Inference, LoRA, Plans

Predibase uses usage-based pricing across three meters: serverless inference billed per token, fine-tuning (training) billed per token of training data and scaled by base-model size, and dedicated deployments billed per GPU-hour by accelerator type. The Free tier provides serverless inference up to daily and monthly token caps; Pay-as-you-go bills metered usage with no monthly minimum; the Developer tier adds self-serve dedicated A10/A100 deployments; and the Enterprise tier adds VPC and multi-GPU (A100/H100) deployments with negotiated terms.

Predibase Plans Pricing is the machine-readable pricing-plan profile for Predibase on the APIs.io network, conforming to the API Commons Plans specification.

It defines four plans across free, usage, and enterprise tiers: Free, Pay-as-you-go, Developer (Dedicated Deployments), and Enterprise.

Tagged areas include AI, LLM, Fine-Tuning, Inference, and LoRA.

Plans

Free · free tier

Serverless inference on shared endpoints up to a daily and monthly token cap, with no cost to start.

Serverless Inference (free allowance) (tokens · month): free up to ~1M tokens/day and ~10M tokens/month (USD)

Shared Endpoints
OpenAI-Compatible Inference
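
The free allowance above combines two caps, and a short calculation shows how they interact. The following is a minimal sketch, assuming the caps are exactly the approximate figures listed (~1M tokens/day, ~10M tokens/month) and that usage is checked against both; the helper names are hypothetical and how Predibase actually enforces the caps is not stated here.

```python
# Approximate caps copied from the Free plan entry (both are listed with "~").
DAILY_CAP_TOKENS = 1_000_000
MONTHLY_CAP_TOKENS = 10_000_000


def within_free_allowance(tokens_today: int, tokens_this_month: int) -> bool:
    """Hypothetical helper: True if usage sits inside both listed caps."""
    return (tokens_today <= DAILY_CAP_TOKENS
            and tokens_this_month <= MONTHLY_CAP_TOKENS)


def days_until_monthly_cap(tokens_per_day: int) -> float:
    """Days of steady usage before the monthly cap is reached.

    Daily usage is clamped to the daily cap, since (by assumption)
    nothing above that cap would be served for free.
    """
    if tokens_per_day <= 0:
        raise ValueError("tokens_per_day must be positive")
    effective = min(tokens_per_day, DAILY_CAP_TOKENS)
    return MONTHLY_CAP_TOKENS / effective


if __name__ == "__main__":
    print(within_free_allowance(500_000, 9_000_000))  # True
    print(days_until_monthly_cap(1_000_000))          # 10.0
```

Under these assumed figures, running at the full daily cap would exhaust the monthly allowance in 10 days, so the monthly cap is the binding limit for sustained use.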

Pay-as-you-go · usage tier

Token-metered serverless inference, token-metered fine-tuning, and GPU-hour-metered dedicated deployments, with charges based only on usage and no monthly minimum.

Serverless Inference (tokens · month): ~$0.20 per 1M tokens (small models; varies by model size) (USD)

Batch Inference (tokens · month): ~$0.50 per 1M tokens (flat rate for input and output) (USD)

Fine-Tuning (up to 7B) (tokens · month): from ~$0.36 per 1M training tokens (USD)

Fine-Tuning (large / MoE, e.g. Mixtral-8x7B) (tokens · month): up to ~$3.21 per 1M training tokens (USD)

Serverless Inference
Batch Inference
Supervised Fine-Tuning
Reinforcement Fine-Tuning (GRPO)
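
The per-token rates above translate into a bill by simple proportion. The sketch below illustrates that arithmetic, assuming the approximate listed rates apply exactly; the meter keys and the function name are made up for illustration, and `tokens` is taken to be the total billed token count (whether training epochs multiply that count is not stated in this listing).

```python
# Approximate USD rates per 1M tokens, copied from the Pay-as-you-go entries.
# The listing marks all of them with "~", "from", or "up to".
RATES_USD_PER_1M = {
    "serverless_inference_small": 0.20,  # small models; varies by model size
    "batch_inference": 0.50,             # flat, input and output
    "fine_tuning_up_to_7b": 0.36,        # listed as a "from" price
    "fine_tuning_large_moe": 3.21,       # listed as an "up to" price
}


def token_cost_usd(meter: str, tokens: int) -> float:
    """Estimated cost in USD for `tokens` billed on the given meter."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    rate = RATES_USD_PER_1M[meter]  # KeyError for an unknown meter
    return round(tokens / 1_000_000 * rate, 2)


if __name__ == "__main__":
    print(token_cost_usd("serverless_inference_small", 50_000_000))  # 10.0
    print(token_cost_usd("fine_tuning_up_to_7b", 20_000_000))        # 7.2
    print(token_cost_usd("fine_tuning_large_moe", 20_000_000))       # 64.2
```

On these figures, the same 20M-token training set costs roughly nine times more on the large / MoE rate than on the up-to-7B rate, which is the base-model-size scaling the overview refers to.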

Developer (Dedicated Deployments) · usage tier

Self-serve dedicated deployments billed per GPU-hour on A10 and A100 accelerators for production traffic.

Dedicated A10G (24GB) (hours · month): from ~$1.82 per GPU-hour (USD)

Dedicated A100 (80GB) (hours · month): billed per GPU-hour; see the pricing page for the rate (USD)

Dedicated Deployments
LoRA / Turbo LoRA Serving
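
GPU-hour billing is easiest to reason about as a monthly figure and as a break-even point against token-metered serving. The sketch below is illustrative only: it assumes the listed A10G starting rate (~$1.82 per GPU-hour) and the listed small-model serverless rate (~$0.20 per 1M tokens) apply exactly, and the function names are hypothetical. The A100 rate is not given in this listing, so the rate is a required parameter rather than a default.

```python
# Listed starting rate for a dedicated A10G (24GB): "from ~$1.82 per GPU-hour".
A10G_USD_PER_GPU_HOUR = 1.82
# Listed small-model serverless rate, used only for the break-even comparison.
SERVERLESS_USD_PER_1M_TOKENS = 0.20


def dedicated_cost_usd(hours: float, usd_per_gpu_hour: float,
                       gpu_count: int = 1) -> float:
    """Estimated cost of running `gpu_count` GPUs for `hours` hours each."""
    if hours < 0 or gpu_count < 1:
        raise ValueError("hours must be >= 0 and gpu_count >= 1")
    return round(hours * gpu_count * usd_per_gpu_hour, 2)


def break_even_tokens_per_hour(
        usd_per_gpu_hour: float,
        usd_per_1m_tokens: float = SERVERLESS_USD_PER_1M_TOKENS) -> int:
    """Hourly token volume at which one GPU-hour costs the same as
    serving those tokens on the per-token meter."""
    return round(usd_per_gpu_hour / usd_per_1m_tokens * 1_000_000)


if __name__ == "__main__":
    # One A10G kept running for a 30-day month (720 hours).
    print(dedicated_cost_usd(720, A10G_USD_PER_GPU_HOUR))     # 1310.4
    print(break_even_tokens_per_hour(A10G_USD_PER_GPU_HOUR))  # 9100000
```

The break-even number compares prices only; it says nothing about whether a single A10G can actually sustain that throughput for a given model.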

Enterprise · enterprise tier

VPC / private cloud deployments, multi-GPU and H100 deployments, dedicated support, and negotiated terms. Contact Predibase sales.

Enterprise Agreement (contract · year): contact sales (USD)

VPC / Private Deployments
H100 and Multi-GPU
Custom Volume Pricing
