Inkeep Rate Limits

Inkeep's AI / RAG chat completions endpoint applies per-IP rate throttling and bounds each chat session to roughly 30 messages, with recommended input of <=100 tokens and output of <=1,000 tokens per request. Inkeep does not publish exact numeric per-account RPM/TPM ceilings; effective limits depend on plan and quoted usage. Specific values are not reconciled in this artifact.

Inkeep Rate Limits is the machine-readable rate-limit profile for Inkeep on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 5 rate-limit definitions, measuring requests, messages, and tokens.

The profile also defines 2 backoff/retry policies and documents the response code returned for throttled requests (429).

Tagged areas include AI, Support, RAG, Agents, and Documentation.

5 limits · Throttle response code: 429
Tags: AI, Support, RAG, Agents, Documentation, Rate Limiting, Quotas, Throttling

Limits

Per-IP Throttling
Scope: ip
Unit: requests
Value: see provider documentation
The AI / RAG chat completions endpoint throttles requests per IP address.

Messages Per Chat Session
Scope: session
Unit: messages
Value: 30
A single chat session is bounded to approximately 30 messages.

Recommended Input Tokens
Scope: request
Unit: tokens
Value: 100
Recommended maximum input of about 100 tokens per request.

Recommended Output Tokens
Scope: request
Unit: tokens
Value: 1000
Recommended maximum output of about 1,000 tokens per request.

Account Rate Limits
Scope: account
Unit: requests
Value: see provider documentation
Per-account ceilings depend on plan and quoted usage; not publicly numbered.
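
The session and token figures listed above can be tracked client-side before a request is sent. The following is a minimal sketch, not Inkeep's own tooling: the class and function names are hypothetical, it assumes every message the client sends counts toward the approximate 30-message bound, and it takes a precomputed token count because the profile does not say which tokenizer applies.

```python
from dataclasses import dataclass

# Values taken from the profile above; all are approximate or recommended,
# not hard guarantees published by the provider.
MAX_SESSION_MESSAGES = 30
RECOMMENDED_INPUT_TOKENS = 100
RECOMMENDED_OUTPUT_TOKENS = 1000


@dataclass
class ChatSessionBudget:
    """Hypothetical client-side counter for one chat session."""

    messages_sent: int = 0

    def can_send(self) -> bool:
        # True while the session is still under the approximate message bound.
        return self.messages_sent < MAX_SESSION_MESSAGES

    def record_message(self) -> None:
        # Assumption: each message sent by this client counts toward the bound.
        if not self.can_send():
            raise RuntimeError("session message budget exhausted; start a new session")
        self.messages_sent += 1


def input_within_recommendation(token_count: int) -> bool:
    """Check a caller-supplied token count against the recommended input size."""
    return token_count <= RECOMMENDED_INPUT_TOKENS
```

A caller would check `can_send()` before each request and open a new session once it returns `False`. The output figure is kept only as a constant here, since how an output cap is passed to the endpoint is not stated in this profile.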

Policies

Plan-Based Limits
Effective limits scale with the quoted self-serve or Enterprise plan.
Backoff Strategy
Clients should implement exponential backoff with jitter and honor 429 responses.
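
The backoff policy above can be sketched as follows. This is an illustrative example under stated assumptions, not code from Inkeep: `send` is a hypothetical callable standing in for whatever HTTP client issues the request and returns a `(status_code, body)` pair, and the base delay, cap, and attempt count are placeholder values, since the profile does not publish recommended ones. The jitter variant shown is "full jitter", where each delay is drawn uniformly between zero and the exponential ceiling.

```python
import random
import time
from typing import Callable, Tuple


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full-jitter delay: a random fraction of min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base: float = 1.0,
                      cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() and retry only when the response status is 429.

    `send` is a hypothetical stand-in for the real request function.
    Returns the last (status, body) pair, which may still be a 429
    if every attempt was throttled.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, base, cap))
    return status, body
```

Randomizing the delay spreads retries from many clients over time, so throttled callers do not all retry at the same instant. Injecting `sleep` and `rng` as parameters keeps the retry loop easy to exercise without real waiting.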