LlamaCloud Rate Limits
LlamaCloud enforces per-account limits expressed as request concurrency and throughput on parsing, extraction, and retrieval, plus a monthly credit allowance per plan that effectively caps page/query volume. Concurrency and rate limits scale with the subscription tier, with Enterprise offering roughly 5x higher rate limits. Parsing, extraction, and indexing are asynchronous job-based workloads, so callers poll job status rather than holding long synchronous connections. Specific per-tier numeric limits are not reconciled in this artifact.
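Because these workloads are job-based, a caller typically submits a job and then checks its status at intervals. The following is a minimal sketch of that polling loop, not LlamaCloud's client API: `poll_job`, the `get_status` callable, and the status names `SUCCESS` and `ERROR` are hypothetical placeholders for whatever the real job-status call returns.

```python
import time
from typing import Callable


def poll_job(
    get_status: Callable[[], str],
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it reaches a terminal state or the timeout elapses.

    get_status is a hypothetical callable wrapping the real status request.
    The terminal status names below are assumptions for illustration.
    """
    terminal = {"SUCCESS", "ERROR"}
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal:
            return status
        if waited >= timeout:
            raise TimeoutError(f"job still {status!r} after {waited:.1f}s")
        # Sleep between checks so polling itself does not consume the
        # request-rate allowance.
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in normal use the default `time.sleep` applies.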
LlamaCloud Rate Limits is the machine-readable rate-limit profile for LlamaCloud on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures 5 rate-limit definitions, measuring concurrent_jobs, requests, queries, and credits.
The profile also defines 3 backoff/retry policies and documents the response codes returned for throttled requests.
Tagged areas include AI, Document Parsing, Extraction, Indexing, and Retrieval.
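The backoff/retry policies mentioned above are not spelled out here, so the following is only a generic sketch of one common approach: capped exponential backoff with full jitter, retrying when a request is throttled. It assumes the throttled response uses HTTP 429 (the standard "Too Many Requests" status); `call_with_backoff`, the `send` callable, and the default delays are hypothetical and do not come from the profile.

```python
import random
import time
from typing import Callable, Tuple

# Assumption: throttling is signalled with the standard HTTP 429 status.
THROTTLED = 429


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Tuple[int, str]:
    """Retry a request while it is throttled, backing off between attempts.

    send is a hypothetical callable returning (status_code, body).
    Returns the first non-throttled response, or the last throttled one
    once max_retries has been exhausted.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != THROTTLED:
            return status, body
        if attempt == max_retries:
            break
        # Exponential growth capped at `cap`, scaled by a random factor
        # (full jitter) so concurrent clients do not retry in lockstep.
        delay = min(cap, base * 2 ** attempt)
        sleep(delay * rng())
    return status, body
```

Any real client should prefer the delays and retry counts published in the profile itself over the illustrative defaults used here.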