Jina AI · Rate Limits

Name: Jina Ai Rate Limits
Creator: Jina AI
Keywords: Rate Limiting, AI, Embeddings, LLM

Jina AI applies per-API-key rate limits across three tiers (Free, Paid, Premium). Limits are enforced as RPM (requests per minute), TPM (tokens per minute), and concurrent in-flight requests, and they apply uniformly across all Search Foundation services (Embeddings, Reranker, Reader, Classifier, Segmenter, DeepSearch). The tier is determined by token balance / billing status on the key.

Jina Ai Rate Limits is the machine-readable rate-limit profile for Jina AI on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 9 rate-limit definitions, measuring requests_per_minute, tokens_per_minute, and concurrent_requests.

The profile also defines 4 backoff/retry policies and documents the response codes returned for the throttled and quotaExceeded conditions.

Tagged areas include Rate Limiting, AI, Embeddings, and LLM.

9 Limits · Throttle: 429 · Quota: 402

Limits

Limit                          Scope     Metric                Window   Value
Free API Key, RPM              api-key   requests_per_minute   minute   100
Free API Key, TPM              api-key   tokens_per_minute     minute   100000
Free API Key, concurrency      api-key   concurrent_requests   -        (not listed)
Paid API Key, RPM              api-key   requests_per_minute   minute   500
Paid API Key, TPM              api-key   tokens_per_minute     minute   2000000
Paid API Key, concurrency      api-key   concurrent_requests   -        (not listed)
Premium API Key, RPM           api-key   requests_per_minute   minute   5000
Premium API Key, TPM           api-key   tokens_per_minute     minute   50000000
Premium API Key, concurrency   api-key   concurrent_requests   -        500
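As a rough illustration of how the listed limits constrain a workload, the sketch below stores the values from this profile in a small table and computes a lower bound on the number of one-minute windows a batch would need. The names TierLimits, LIMITS, and min_minutes are hypothetical, concurrency is left as None where the profile lists no value, and the calculation assumes fixed one-minute windows, which the profile does not confirm.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierLimits:
    rpm: int                    # requests per minute
    tpm: int                    # tokens per minute
    concurrency: Optional[int]  # None where the profile lists no value


# Values copied from the limits listed in this profile.
LIMITS = {
    "free": TierLimits(rpm=100, tpm=100_000, concurrency=None),
    "paid": TierLimits(rpm=500, tpm=2_000_000, concurrency=None),
    "premium": TierLimits(rpm=5_000, tpm=50_000_000, concurrency=500),
}


def min_minutes(tier: str, requests: int, tokens: int) -> int:
    """Lower bound on one-minute windows needed, whichever limit binds first."""
    limits = LIMITS[tier]
    by_requests = math.ceil(requests / limits.rpm)
    by_tokens = math.ceil(tokens / limits.tpm)
    return max(by_requests, by_tokens)


if __name__ == "__main__":
    for tier in LIMITS:
        print(tier, min_minutes(tier, requests=1_000, tokens=10_000_000))
```

For a batch of 1,000 requests carrying 10,000,000 tokens in total, the token limit is the binding constraint on the Free and Paid tiers, while the Premium tier fits the whole batch in a single window.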

Policies

Tier upgrade

Tier (Free / Paid / Premium) is determined by token balance / billing status on the API key; topping up tokens promotes the key to a higher tier with higher RPM, TPM, and concurrency.

Backoff

Honor Retry-After on 429 and back off exponentially with jitter.
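A minimal sketch of this policy follows. It is not Jina AI's client code: the send callable, the (status, headers) tuple shape, and the base and cap defaults are assumptions made for illustration. The sketch uses the Retry-After value when it is a number of seconds and otherwise falls back to exponential backoff with full jitter; the HTTP-date form of Retry-After is not handled here.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers); hypothetical shape


def backoff_delay(attempt: int, retry_after: Optional[str],
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # server-provided delay wins
        except ValueError:
            pass  # not a plain number of seconds; fall back to backoff
    # Full jitter: uniform between 0 and the capped exponential ceiling.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retries(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns something other than 429 or attempts run out."""
    status, headers = send()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
        status, headers = send()
    return status, headers
```

Passing sleep as a parameter keeps the retry loop testable without real waiting. Note that a plain dict lookup is case-sensitive, whereas most HTTP libraries expose case-insensitive header mappings.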

Token exhaustion

When the token balance reaches zero, the API returns 402 Payment Required; top up via Stripe in the API Dashboard.
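Because a 402 signals an exhausted balance rather than a temporary throttle, a client would normally treat it differently from a 429: waiting and retrying does not help until the key is topped up. The sketch below shows one way to separate the two cases; the Action enum and classify function are hypothetical names, and any status codes other than 429 and 402 are handled generically since this profile does not document them.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"          # request succeeded
    RETRY_LATER = "retry_later"  # 429: throttled, wait and retry
    TOP_UP = "top_up"            # 402: balance exhausted, retrying will not help
    FAIL = "fail"                # anything else; not covered by this profile


def classify(status: int) -> Action:
    """Map an HTTP status code to the action a caller should take."""
    if 200 <= status < 300:
        return Action.PROCEED
    if status == 429:
        return Action.RETRY_LATER
    if status == 402:
        return Action.TOP_UP
    return Action.FAIL
```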

Shared across services

Rate-limit and concurrency caps apply across all Jina services on the same key, not per-endpoint.
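Since the caps are counted per key rather than per endpoint, a client that calls several services may want one limiter object per API key instead of one per service. The sketch below is an illustrative client-side guard, not part of any Jina AI SDK: KeyLimiter is a hypothetical class that tracks request starts in a 60-second sliding window and in-flight requests with a semaphore. It does not track TPM, and whether the server counts fixed or sliding windows is an assumption.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque


class KeyLimiter:
    """One instance per API key, shared by every service client using that key."""

    def __init__(self, rpm: int, concurrency: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rpm = rpm
        self._clock = clock
        self._starts: Deque[float] = deque()  # request start times, last 60 s
        self._lock = threading.Lock()
        self._slots = threading.BoundedSemaphore(concurrency)

    def try_acquire(self) -> bool:
        """Reserve a request slot without blocking; False means wait and retry."""
        if not self._slots.acquire(blocking=False):
            return False  # too many requests already in flight
        with self._lock:
            now = self._clock()
            # Drop start times that have aged out of the 60-second window.
            while self._starts and now - self._starts[0] >= 60.0:
                self._starts.popleft()
            if len(self._starts) >= self._rpm:
                self._slots.release()
                return False  # RPM budget for this window is spent
            self._starts.append(now)
            return True

    def release(self) -> None:
        """Call once when a request that acquired a slot has finished."""
        self._slots.release()
```

In use, the same KeyLimiter instance would be passed to the embeddings client, the reranker client, and so on, so that a burst against one service correctly reduces the budget available to the others.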
