Protect AI · Rate Limits

Protectai Rate Limits

The primary developer surface, the LLM Guard API, is self-hosted: there is no vendor-imposed rate limit. Throughput is bounded only by the resources of the deployment you operate (CPU/GPU, worker count, model load), and operators may add their own gateway-level limits. The commercial platform products (Guardian, Recon, Layer) run as managed services whose limits are governed by account agreements and are not publicly documented. No specific per-endpoint values are reconciled in this artifact.

Protectai Rate Limits is the machine-readable rate-limit profile for Protect AI on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 2 rate-limit definitions, both measured in requests.

The profile also defines 2 backoff/retry policies and documents the response code returned for throttled requests (429).

Tagged areas include AI, ML, Security, LLM, and Guardrails.

2 limits · Throttle response code: 429
Tags: AI, ML, Security, LLM, Guardrails, Rate Limiting, Quotas, Throttling

Limits

LLM Guard API (Self-Hosted)
Scope: deployment
Unit: requests
Limit: no provider-imposed limit
Notes: Bounded by your own compute and worker configuration; add gateway limits as needed.

Guardian / Recon / Layer (Commercial)
Scope: account
Unit: requests
Limit: see account agreement
Notes: Managed-service limits set by contract; not publicly documented.
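
For a self-hosted deployment, a gateway-level limit is something the operator chooses and implements. The sketch below shows one common approach, a token bucket, purely as an illustration. The profile does not prescribe any algorithm, and the class name, the `rate` and `capacity` parameters, and the values used are all hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Illustrative token-bucket limiter (hypothetical; not part of LLM Guard).

    `rate` is tokens refilled per second; `capacity` is the maximum burst size.
    """

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last update, up to capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume `cost` tokens if the request may proceed."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def retry_after(self, cost: float = 1.0) -> float:
        """Seconds until `cost` tokens are available (0.0 if already available)."""
        self._refill()
        return max(0.0, (cost - self.tokens) / self.rate)
```

A gateway using this sketch would reject a request with 429 when `allow()` returns False and could send `retry_after()` as the Retry-After value, which matches the throttle response code this profile documents.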

Policies

Self-Hosted Throughput
Scale LLM Guard horizontally and disable unused scanners to raise effective throughput; no vendor quota applies.
Backoff Strategy
Clients should implement exponential backoff with jitter and honor Retry-After when fronting LLM Guard with a gateway or calling commercial endpoints.
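
The backoff policy above can be sketched on the client side as follows. This is a minimal illustration under stated assumptions, not a documented Protect AI client: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, the `base` and `cap` defaults are arbitrary, and only the numeric (delta-seconds) form of Retry-After is parsed.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# (status code, response headers, body); a hypothetical shape for illustration.
Response = Tuple[int, Mapping[str, str], bytes]


def parse_retry_after(headers: Mapping[str, str]) -> Optional[float]:
    """Return Retry-After in seconds if it is a non-negative number, else None.

    Only the delta-seconds form is handled; the HTTP-date form yields None.
    """
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                seconds = float(value.strip())
            except ValueError:
                return None
            return seconds if seconds >= 0 else None
    return None


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Exponential backoff with full jitter.

    The delay is drawn uniformly from [0, min(cap, base * 2**attempt)).
    """
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        if response[0] != 429:
            return response
        # Prefer the server's Retry-After hint; otherwise use jittered backoff.
        retry_after = parse_retry_after(response[1])
        delay = retry_after if retry_after is not None else backoff_delay(attempt)
        sleep(delay)
        response = send()
    return response
```

Using the server's Retry-After value when present, and jittered exponential delays otherwise, keeps many clients from retrying in lockstep after a throttle. The last response is returned even if it is still a 429, so the caller decides how to surface the failure.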
