Inkeep Rate Limits

Name: Inkeep Rate Limits
Creator: Inkeep
Keywords: AI, Support, RAG, Agents, Documentation, Rate Limiting, Quotas, Throttling

Inkeep's AI / RAG chat completions endpoint applies per-IP rate throttling and bounds each chat session to roughly 30 messages, with recommended input of <=100 tokens and output of <=1,000 tokens per request. Inkeep does not publish exact numeric per-account RPM/TPM ceilings; effective limits depend on plan and quoted usage. Specific values are not reconciled in this artifact.

Inkeep Rate Limits is the machine-readable rate-limit profile for Inkeep on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 5 rate-limit definitions, measuring requests, messages, and tokens.

The profile also defines 2 policies (plan-based limits and a backoff/retry strategy) and documents the response codes returned for throttled requests.

Tagged areas include AI, Support, RAG, Agents, and Documentation.

Limits: 5. Throttle response code: 429.

Limits

Per-IP Throttling
Scope: ip
Metric: requests
Limit: see provider documentation
The AI / RAG chat completions endpoint throttles requests per IP address.

Messages Per Chat Session
Scope: session
Metric: messages
Limit: approximately 30
A single chat session is bounded to approximately 30 messages.
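
One way a client might stay under the session bound is to count messages locally and start a fresh session once the cap is reached. The sketch below is illustrative only: `SessionManager`, `ChatSession`, and the locally generated session ids are hypothetical names, not part of any Inkeep SDK, and the cap of 30 is the approximate figure from this profile. Whether the provider counts user messages only or both sides of the conversation is not stated here, so treat the counting rule as an assumption.

```python
import uuid
from dataclasses import dataclass, field

# Approximate bound taken from this profile; the provider's exact rule may differ.
SESSION_MESSAGE_CAP = 30


@dataclass
class ChatSession:
    # Hypothetical client-side session record; the id is generated locally.
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    messages_sent: int = 0


class SessionManager:
    """Tracks messages per session and rotates to a new session at the cap."""

    def __init__(self, cap: int = SESSION_MESSAGE_CAP) -> None:
        self.cap = cap
        self.current = ChatSession()

    def next_message_session(self) -> str:
        """Return the session id to use for the next outgoing message."""
        if self.current.messages_sent >= self.cap:
            # Cap reached: start a fresh session before sending.
            self.current = ChatSession()
        self.current.messages_sent += 1
        return self.current.session_id
```

A caller would ask the manager for a session id before each message and attach it however its own client identifies sessions.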

Recommended Input Tokens
Scope: request
Metric: tokens
Limit: 100
Recommended maximum input of about 100 tokens per request.

Recommended Output Tokens
Scope: request
Metric: tokens
Limit: 1000
Recommended maximum output of about 1,000 tokens per request.
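
A client could pre-check prompts against the recommended input size before sending. The sketch below is a rough approximation under stated assumptions: it uses a heuristic of about 4 characters per token, which is not the provider's tokenizer and can be off by a wide margin for code or non-English text. The function and constant names are hypothetical.

```python
# Recommended per-request sizes from this profile (guidance, not hard limits).
RECOMMENDED_INPUT_TOKENS = 100
RECOMMENDED_OUTPUT_TOKENS = 1000


def estimate_tokens(text: str) -> int:
    """Rough token estimate: ceil(len(text) / 4). Assumption, not a real tokenizer."""
    return (len(text) + 3) // 4


def within_input_budget(text: str, limit: int = RECOMMENDED_INPUT_TOKENS) -> bool:
    """True if the estimated token count is at or under the recommended input size."""
    return estimate_tokens(text) <= limit
```

For accurate counts, the heuristic would need to be replaced with whatever tokenizer matches the model actually serving the request.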

Account Rate Limits
Scope: account
Metric: requests
Limit: see provider documentation
Per-account ceilings depend on plan and quoted usage; not publicly numbered.

Policies

Plan-Based Limits

Effective limits scale with the quoted self-serve or Enterprise plan.

Backoff Strategy

Clients should implement exponential backoff with jitter and honor 429 responses.
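
The backoff policy above can be sketched as follows. This is a minimal illustration, not an official client: `send` is a hypothetical callable that performs one request and returns `(status_code, body)`, and the base delay, cap, and attempt count are assumed values rather than figures published by Inkeep. The jitter variant shown is "full jitter", where each delay is drawn uniformly between zero and the exponential ceiling.

```python
import random
import time
from typing import Any, Callable, Tuple


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full-jitter delay in seconds: uniform in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], Tuple[int, Any]],
                      max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, Any]:
    """Call send() and retry only while it returns HTTP 429.

    Returns the last (status_code, body) pair, which may still be a 429
    if every attempt was throttled.
    """
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        # Throttled: wait a randomized, exponentially growing interval.
        sleep(backoff_delay(attempt))
        status, body = send()
    return status, body
```

Randomizing the delay keeps many throttled clients from retrying in lockstep. Passing `sleep` and `rng` as parameters makes the retry loop easy to exercise in tests without real waiting.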
