Replicate Rate Limits
Replicate API rate limits per account.
Replicate Rate Limits is the machine-readable rate-limit profile for Replicate on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures 4 rate-limit definitions, measuring predictions_per_second, requests_per_second, and concurrent.
The profile also documents the response code returned when a request is throttled (HTTP 429).
Tagged areas include Rate Limiting and ML Inference.
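As an illustration of how a client might react to the throttle code in this profile, the following is a minimal Python sketch of retrying with exponential backoff and jitter. The function `call_with_backoff`, its parameters, and the `send` callable (returning a status code and body) are hypothetical names introduced for this example; they are not part of the Replicate API or of the profile, and the delay values are arbitrary assumptions.

```python
import random
import time
from typing import Callable, Tuple

THROTTLED = 429  # throttle status code listed in the profile


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical stand-in for whatever performs the HTTP
    request; it must return a (status_code, body) pair.
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != THROTTLED:
            break
        # Exponential backoff with full jitter: wait a random time between
        # 0 and base_delay * 2^(attempt - 1) before the next try.
        sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
        status, body = send()
    return status, body
```

If every attempt is throttled, the last 429 response is returned to the caller rather than raised, so the caller decides whether to give up or queue the work for later.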
Throttle: 429
Limits
Limit                               Scope     Value
Predictions create (default)        account   10
Predictions create (paid raised)    account   100
Other endpoints                     account   60
Concurrent predictions              account   varies
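One way a client could stay under limits like these is a client-side token bucket. The sketch below is a generic Python implementation, not something the profile or Replicate prescribes. The class name `TokenBucket` and its parameters are hypothetical. It also assumes the numeric values above are per-second rates, which the metric names suggest but the listing does not state for each row.

```python
import time
from typing import Callable


class TokenBucket:
    """Generic token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start with a full bucket
        self._last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False if the caller should wait."""
        now = self._clock()
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# Assumption: the default "Predictions create" value of 10 is a per-second rate.
create_limiter = TokenBucket(rate=10, capacity=10)
```

A caller would check `create_limiter.try_acquire()` before each prediction request and delay the request when it returns `False`. The concurrent-predictions limit is listed as "varies", so it is not modeled here.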