Replicate Rate Limits

Replicate API rate limits per account.

Replicate Rate Limits is the machine-readable rate-limit profile for Replicate on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures four rate-limit definitions, measured as predictions_per_second, requests_per_second, and concurrent (the number of predictions running at once).

The profile also documents the response code returned for throttled requests (429).

Tagged areas include Rate Limiting and ML Inference.

Limits: 4 · Throttle response: 429
Tags: Rate Limiting, ML Inference
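
Since the profile lists 429 as the throttle response, a client will usually want to retry throttled calls rather than fail outright. The following is a minimal sketch of one common approach (exponential backoff with jitter), not a description of Replicate's recommended client behavior. The `send` callable and its `(status, headers, body)` return shape are hypothetical, and honoring a `Retry-After` header is an assumption: the profile does not say whether Replicate sends one.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for illustration: (status, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry while it returns HTTP 429.

    Returns the first non-429 response, or the last 429 response once
    max_attempts calls have been made.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    response = send()
    for attempt in range(max_attempts - 1):
        if response[0] != 429:
            return response

        # Assumption: the server may send Retry-After in seconds.
        # Other formats (such as an HTTP date) fall through to backoff.
        delay = None
        retry_after = response[1].get("Retry-After")
        if retry_after is not None:
            try:
                delay = min(max_delay, max(0.0, float(retry_after)))
            except ValueError:
                delay = None

        if delay is None:
            # Exponential backoff with full jitter, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            delay = random.uniform(0.0, ceiling)

        sleep(delay)
        response = send()

    return response
```

The `sleep` parameter exists only so the wait can be replaced in tests; in normal use the default `time.sleep` applies.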

Limits

Limit                             Scope    Metric                  Window  Value
Predictions create (default)      account  predictions_per_second  second  10
Predictions create (paid raised)  account  predictions_per_second  second  100
Other endpoints                   account  requests_per_second     second  60
Concurrent predictions            account  concurrent              n/a     varies
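
To stay under a per-second limit such as the default 10 prediction creations per second, a client can pace its own requests before sending them. The sketch below uses a token bucket, which is one common technique and an assumption here: the profile does not state how Replicate measures the one-second window, so the `TokenBucket` class, its burst `capacity`, and the injectable `clock` are all illustrative names rather than part of any Replicate library.

```python
import time
from typing import Callable, Optional


class TokenBucket:
    """Client-side pacing: allow about `rate` operations per second.

    Not thread-safe; wrap calls in a lock if shared across threads.
    """

    def __init__(
        self,
        rate: float,
        capacity: Optional[float] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if rate <= 0:
            raise ValueError("rate must be positive")
        self.rate = float(rate)
        # Assumption: allow a burst of up to one second's worth of tokens.
        self.capacity = float(rate if capacity is None else capacity)
        self._clock = clock
        self._tokens = self.capacity
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last refill.
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False instead of blocking."""
        self._refill()
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next token should be available (0 if ready)."""
        self._refill()
        return max(0.0, (1.0 - self._tokens) / self.rate)


# Usage sketch: pace calls to the default limit of 10 predictions per second.
bucket = TokenBucket(rate=10)
if not bucket.try_acquire():
    time.sleep(bucket.wait_time())
```

Pacing on the client reduces how often the server has to answer with 429, but it does not replace handling that response, because the limits are per account and other processes may share the same account.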

Sources