Contextual AI · Rate Limits

Name: Contextual AI Rate Limits
Creator: Contextual AI
Keywords: AI, RAG, LLM, Grounded Language Model, Enterprise, Rate Limiting, Quotas, Throttling

The Contextual AI platform enforces per-workspace rate limits and per-request constraints across its APIs. Documented hard input constraints are a 32,000-token total limit on Generate requests, a 7,000-token limit on LMUnit requests, and Parse file limits of 300 MB and 2,000 pages per file. Per-endpoint requests-per-minute (RPM) and tokens-per-minute (TPM) throttles vary by tier, and this profile does not record specific values for them; Enterprise agreements raise the limits. Throttled requests return HTTP 429.

Contextual AI Rate Limits is the machine-readable rate-limit profile for Contextual AI on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 6 rate-limit definitions, measuring tokens, bytes, pages, and requests.

The profile also defines 3 policies (job result retention, tiered limits, and a backoff strategy) and documents the response code returned for throttled requests, HTTP 429.

Tagged areas include AI, RAG, LLM, Grounded Language Model, and Enterprise.


Limits

Generate Request Tokens

Scope: request
Unit: tokens
Limit: 32,000

Total tokens (messages + knowledge + output) per Generate request.

LMUnit Request Tokens

Scope: request
Unit: tokens
Limit: 7,000

Total input tokens per LMUnit evaluation request.
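The two per-request token ceilings above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the function names are hypothetical, and the token counts are assumed to come from whatever tokenizer the caller uses, since this profile does not say how tokens are counted.

```python
# Ceilings taken from the limits listed above.
GENERATE_MAX_TOTAL_TOKENS = 32_000  # messages + knowledge + output
LMUNIT_MAX_INPUT_TOKENS = 7_000     # total input tokens


def generate_fits(message_tokens: int, knowledge_tokens: int,
                  max_output_tokens: int) -> bool:
    """True if a Generate request stays within the 32,000-token total."""
    total = message_tokens + knowledge_tokens + max_output_tokens
    return total <= GENERATE_MAX_TOTAL_TOKENS


def max_output_budget(message_tokens: int, knowledge_tokens: int) -> int:
    """Tokens left for output once messages and knowledge are counted."""
    remaining = GENERATE_MAX_TOTAL_TOKENS - message_tokens - knowledge_tokens
    return max(0, remaining)


def lmunit_fits(input_tokens: int) -> bool:
    """True if an LMUnit request stays within the 7,000-token input limit."""
    return input_tokens <= LMUNIT_MAX_INPUT_TOKENS
```

Because the Generate limit covers output as well as input, a caller that retrieves a lot of knowledge may need to trim it or lower the requested output length so that the sum stays under the ceiling.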

Parse File Size

Scope: request
Unit: bytes
Limit: 314,572,800

Maximum 300 MB per file submitted to Parse.

Parse File Pages

Scope: request
Unit: pages
Limit: 2,000

Maximum 2,000 pages per file submitted to Parse.
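The Parse file limits lend themselves to a local pre-flight check, so that an oversized file is rejected before it is uploaded. This is a sketch under stated assumptions: the helper names are hypothetical, and the page count is assumed to be supplied by the caller (for example from a PDF library), because nothing in this profile describes how pages are counted.

```python
import os
from typing import List, Optional

# Values taken from the limits listed above.
PARSE_MAX_BYTES = 300 * 1024 * 1024  # 314,572,800 bytes
PARSE_MAX_PAGES = 2_000


def parse_limit_violations(size_bytes: int,
                           page_count: Optional[int] = None) -> List[str]:
    """Return human-readable violations; an empty list means the file fits."""
    problems = []
    if size_bytes > PARSE_MAX_BYTES:
        problems.append(
            f"file is {size_bytes} bytes; limit is {PARSE_MAX_BYTES}")
    if page_count is not None and page_count > PARSE_MAX_PAGES:
        problems.append(
            f"file has {page_count} pages; limit is {PARSE_MAX_PAGES}")
    return problems


def check_file(path: str, page_count: Optional[int] = None) -> List[str]:
    """Convenience wrapper that reads the size from disk."""
    return parse_limit_violations(os.path.getsize(path), page_count)
```

A file that fails either check would need to be split before submission; the check is only a convenience, and the server remains the authority on what it accepts.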

Requests Per Minute (RPM)

Scope: workspace
Unit: requests
Limit: see provider documentation

Per-endpoint RPM varies by tier; not publicly reconciled.

Tokens Per Minute (TPM)

Scope: workspace
Unit: tokens
Limit: see provider documentation

Per-endpoint TPM varies by tier; not publicly reconciled.
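Since the RPM and TPM values depend on tier and are not given here, a client-side limiter has to take its ceiling as configuration. The sketch below shows one possible approach, a sliding 60-second window; the class name and the example ceiling are hypothetical, and the real values must come from the provider documentation or the account's agreement.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Tracks usage over the last 60 seconds against a configured ceiling.

    Use cost=1 per call to model RPM, or cost=<token count> to model TPM.
    """

    def __init__(self, max_per_minute: int, clock=time.monotonic):
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        self.max_per_minute = max_per_minute
        self.clock = clock
        self.events = deque()  # (timestamp, cost) pairs, oldest first
        self.used = 0

    def _expire(self, now: float) -> None:
        # Drop events that have aged out of the 60-second window.
        while self.events and now - self.events[0][0] >= 60.0:
            _, cost = self.events.popleft()
            self.used -= cost

    def try_acquire(self, cost: int = 1) -> bool:
        """Record the usage and return True if it fits, else return False."""
        now = self.clock()
        self._expire(now)
        if self.used + cost > self.max_per_minute:
            return False
        self.events.append((now, cost))
        self.used += cost
        return True
```

A caller might keep one limiter per endpoint for requests and another for tokens, and wait or queue work when `try_acquire` returns False. Client-side limiting only reduces how often the server has to throttle; a 429 can still occur and needs to be handled.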

Policies

Job Result Retention

Parse job status and results are retained for 30 days; older requests return 404.
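A client that stores Parse job identifiers can use the 30-day window to decide whether a lookup is still worthwhile. This is a rough sketch: the function name is hypothetical, and it assumes the window is measured from the time the job was submitted, which this profile does not state.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(days=30)  # retention period stated above


def results_expected(submitted_at: datetime,
                     now: Optional[datetime] = None) -> bool:
    """True if the job is still inside the 30-day retention window.

    Both datetimes must be timezone-aware. Assumes the window starts at
    submission time; the service may measure it differently.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - submitted_at < RETENTION
```

When this returns False, a 404 from the status or results endpoint most likely means the job has expired rather than that the identifier is wrong, and the file would need to be parsed again.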

Tiered Limits

Limits increase with paid usage and can be raised further through Enterprise agreements.

Backoff Strategy

Clients should implement exponential backoff with jitter and honor Retry-After on 429.
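The backoff policy above can be sketched with the standard library alone. The example uses the "full jitter" variant of exponential backoff, which is one common choice and not something this profile prescribes. The `send` callable is hypothetical: it is assumed to perform one HTTP request and return a `(status, headers, body)` tuple, with `headers` as a dict keyed by the exact name `Retry-After`. Only the delay-in-seconds form of `Retry-After` is handled; the HTTP-date form falls back to computed backoff.

```python
import random
import time
from typing import Callable, Optional, Tuple


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait before the next try (attempt counts from 0)."""
    if retry_after is not None:
        try:
            # Honor the server's hint when it is a number of seconds.
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            pass  # HTTP-date form is not parsed in this sketch
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling  # full jitter: uniform in [0, ceiling)


def call_with_retries(send: Callable[[], Tuple[int, dict, str]],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, body  # still throttled after the final attempt
```

The jitter spreads retries from many clients over time so they do not all return at once, and the cap keeps the wait bounded. The `base`, `cap`, and `max_attempts` defaults are illustrative values rather than recommendations from Contextual AI.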
