Gravitee Rate Limits

Gravitee is a self-hosted (or Gravitee-hosted) API and Event Management platform; it does not impose per-call rate limits on customer traffic. Instead, it provides Rate Limit, Quota, and Spike Arrest policies that customers configure on their own gateways to throttle their consumers. Plan-level limits (gateways included, support tier) are commercial caps rather than runtime request caps, and Gravitee advertises "unlimited API calls and events" within the subscription.

Gravitee Rate Limits is the machine-readable rate-limit profile for Gravitee on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 5 rate-limit definitions, measuring requests_per_month, gateway, requests_per_second, requests_per_period, and llm_tokens_per_period.

The profile also defines 6 backoff/retry policies and documents response codes for the throttled, quotaExceeded, and serviceUnavailable conditions.

Tagged areas include Rate Limiting, API Gateway, API Management, and Event Streaming.

5 Limits · Throttle: 429 · Quota: 429

Limits

Customer-facing API calls (subscription)
Scope: subscription
Metric: requests_per_month (period: month)
Limit: -1 (unlimited)
Gravitee subscription plans cover unlimited API calls and events; the monthly subscription fee is the only volume-related ceiling.

Production gateways (per plan)
Scope: subscription
Metric: gateway
Limit: see plan
Planet=1, Galaxy=2, Universe=4+, Comet=1, Meteor=2, Asteroid=4. Beyond these, additional gateways require an upgrade or add-on.

Configurable Rate Limit policy
Scope: api/consumer
Metric: requests_per_second
Limit: operator-defined
API publishers configure per-second rate limits inside the Gravitee Rate Limit policy on each API.

Configurable Quota policy
Scope: api/consumer/subscription
Metric: requests_per_period
Limit: operator-defined
API publishers set per-day, per-week, or per-month quotas via the Quota policy.

AI Prompt Token Tracking (Agent Management)
Scope: api/consumer/model
Metric: llm_tokens_per_period
Limit: operator-defined
The AI Prompt Token Tracking policy meters input and output tokens per consumer, per LLM route. Combined with the Quota or Rate Limit policy, it enables LLM-cost throttling.
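
The operator-defined limits above all reduce to counting units (requests or LLM tokens) per consumer within a time window. The following is a minimal sketch of that idea using a fixed-window counter. It is not Gravitee's implementation: the class name FixedWindowLimiter, its parameters, and the consumer key are hypothetical, and a real gateway would keep these counters in a shared store rather than in process memory.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FixedWindowLimiter:
    """Illustrative fixed-window counter (hypothetical, not Gravitee code)."""
    limit: int            # maximum units allowed per window
    window_seconds: int   # 1 for a per-second limit, 86_400 for a per-day quota
    _counters: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def allow(self, key: str, now: float, cost: int = 1) -> bool:
        """Record `cost` units for `key` and return True if the limit holds."""
        window = int(now // self.window_seconds)  # index of the current window
        used = self._counters.get((key, window), 0)
        if used + cost > self.limit:
            return False  # a gateway would answer 429 at this point
        self._counters[(key, window)] = used + cost
        return True


# Short-window rate limit: 2 requests per consumer per second.
rate = FixedWindowLimiter(limit=2, window_seconds=1)
# Long-window token quota: 1000 LLM tokens per consumer per day.
tokens = FixedWindowLimiter(limit=1000, window_seconds=86_400)

print(rate.allow("consumer-a", now=0.0))               # True
print(rate.allow("consumer-a", now=0.5))               # True
print(rate.allow("consumer-a", now=0.9))               # False, over the limit
print(rate.allow("consumer-a", now=1.1))               # True, new window
print(tokens.allow("consumer-a", now=0.0, cost=800))   # True
print(tokens.allow("consumer-a", now=10.0, cost=300))  # False, 1100 > 1000
```

The same counter serves both cases because only the window length and the cost per call differ: a request costs 1 unit, while an LLM call costs the number of tokens it consumed. The sketch never evicts old windows, which a production counter would need to do.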

Policies

Rate Limit policy
Short-window throttling (e.g. requests per second) applied at the gateway, with configurable burst and refill semantics.
Quota policy
Longer-window quotas (per day / week / month) per consumer or per subscription plan.
Spike Arrest
Smooths traffic bursts by splitting the configured window into smaller time buckets and rejecting requests that exceed the allowed peak within each bucket.
Backoff strategy
When 429 is returned by the customer-configured policy, downstream consumers should honor Retry-After and use exponential backoff.
AI Prompt Token Tracking policy
Meters LLM input / output token usage per consumer, per route, per model. Backed by Redis or Hazelcast distributed counters and feeds the Quota / Rate Limit policy for AI cost caps.
AI Prompt Guard-Rails policy
Allow / deny / classify prompts before they reach an LLM upstream; rejects matching requests with 4xx response codes.
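
The backoff strategy listed above can be sketched on the consumer side as follows. This is an illustrative client helper, not part of Gravitee: the function names retry_delay and call_with_backoff, the base and cap defaults, and the (status, headers) response shape are all assumptions. It honors the delta-seconds form of Retry-After when present and otherwise falls back to exponential backoff with full jitter.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers); assumed shape


def retry_delay(attempt: int, retry_after: Optional[str],
                base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # delta-seconds Retry-After
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch
    backoff = min(cap, base * (2 ** attempt))  # exponential growth, capped
    return random.uniform(0, backoff)          # full jitter spreads retries out


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt < max_attempts - 1:
            # Assumes the header key is spelled exactly "Retry-After".
            sleep(retry_delay(attempt, headers.get("Retry-After")))
    return status, headers
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; `send` would wrap whatever HTTP client the consumer already uses.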

Sources