Gravitee Rate Limits

Name: Gravitee Rate Limits
Creator: Gravitee
Keywords: Rate Limiting, API Gateway, API Management, Event Streaming

Gravitee is a self-hosted (or Gravitee-hosted) API and Event Management platform; it does not impose per-call rate limits on customer traffic. Instead, it provides Rate Limit, Quota, and Spike Arrest policies that customers configure on their own gateways to throttle their consumers. Plan-level limits (gateways included, support tier) are commercial caps rather than runtime request caps, and Gravitee advertises "unlimited API calls and events" within the subscription.

Gravitee Rate Limits is the machine-readable rate-limit profile for Gravitee on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 5 rate-limit definitions, measuring requests_per_month, gateway, requests_per_second, requests_per_period, and llm_tokens_per_period.

The profile also defines 6 backoff/retry policies and documents response codes for throttled, quotaExceeded, and serviceUnavailable.

Tagged areas include Rate Limiting, API Gateway, API Management, and Event Streaming.

5 Limits · Throttle: 429 · Quota: 429

Limits

Customer-facing API calls (subscription)

Scope: subscription
Metric: requests_per_month · month
Limit: -1 (unlimited)

Gravitee subscription plans cover unlimited API calls and events; the monthly subscription fee is the only volume-related ceiling.

Production gateways (per plan)

Scope: subscription
Metric: gateway
Limit: see plan

Gateways included per plan: Planet=1, Galaxy=2, Universe=4+, Comet=1, Meteor=2, Asteroid=4. Beyond these, additional gateways require an upgrade or add-on.

Configurable Rate Limit policy

Scope: api/consumer
Metric: requests_per_second
Limit: operator-defined

API publishers configure per-second rate limits inside the Gravitee Rate Limit policy on each API.

Configurable Quota policy

Scope: api/consumer/subscription
Metric: requests_per_period
Limit: operator-defined

API publishers set per-day, per-week, or per-month quotas via the Quota policy.

AI Prompt Token Tracking (Agent Management)

Scope: api/consumer/model
Metric: llm_tokens_per_period
Limit: operator-defined

The AI Prompt Token Tracking policy meters input and output tokens per consumer, per LLM route. Combined with the Quota or Rate Limit policy, it enables LLM-cost throttling.

Policies

Rate Limit policy

Short-window throttling (e.g. requests per second) applied at the gateway, with configurable burst and refill semantics.
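
One common way to get burst and refill behavior in a short-window limiter is a token bucket. The sketch below is illustrative only: the class name TokenBucket and its rate and burst parameters are hypothetical, and nothing here is taken from Gravitee's implementation or configuration schema.

```python
class TokenBucket:
    """Hypothetical token bucket: refills `rate` tokens per second, capped at `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate              # tokens added per second
        self.burst = burst            # maximum tokens the bucket can hold
        self.tokens = float(burst)    # start full so an initial burst is allowed
        self.updated = None           # timestamp of the previous call, if any

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may pass."""
        if self.updated is not None:
            elapsed = max(0.0, now - self.updated)
            # Refill in proportion to elapsed time, never beyond the burst size.
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a gateway would answer 429 at this point
```

With rate=2 and burst=3, three back-to-back requests pass, the fourth is rejected, and one more request is admitted half a second later once a token has been refilled.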

Quota policy

Longer-window quotas (per day / week / month) per consumer or per subscription plan.
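
A longer-window quota can be pictured as a counter per consumer per calendar window. The following is a minimal sketch under that assumption; QuotaCounter, its parameters, and the in-memory dictionary are hypothetical and do not reflect how Gravitee stores or evaluates quotas.

```python
class QuotaCounter:
    """Hypothetical fixed-window quota: at most `limit` requests per key per period."""

    def __init__(self, limit, period_seconds):
        self.limit = limit
        self.period = period_seconds
        self.counts = {}  # (key, window index) -> requests seen in that window

    def allow(self, key, now):
        """Count a request for `key` at time `now` (seconds); False once the quota is spent."""
        window = int(now // self.period)  # e.g. day number when period is 86400
        bucket = (key, window)
        used = self.counts.get(bucket, 0)
        if used >= self.limit:
            return False  # a gateway would answer 429 at this point
        self.counts[bucket] = used + 1
        return True
```

The key can be a consumer identifier or a subscription identifier, which mirrors the per-consumer or per-plan scoping described above. Each key gets a fresh allowance when the window index changes.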

Spike Arrest

Smooths traffic bursts by rejecting requests above an allowed peak per smaller time bucket.
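
The idea of enforcing a peak per smaller time bucket can be sketched by dividing the configured limit across slices of the period. This is an assumption-laden illustration: SpikeArrest, period_ms, and slice_ms are hypothetical names, and the slicing rule is one plausible reading rather than Gravitee's documented algorithm.

```python
class SpikeArrest:
    """Hypothetical spike arrest: spreads `limit` per period evenly across smaller slices."""

    def __init__(self, limit, period_ms, slice_ms):
        self.slice_ms = slice_ms
        # Share of the limit allowed in one slice (at least 1), using integer math.
        self.per_slice = max(1, limit * slice_ms // period_ms)
        self.current_slice = None
        self.count = 0

    def allow(self, now_ms):
        """Return True if a request at `now_ms` (milliseconds) fits in its slice."""
        index = now_ms // self.slice_ms
        if index != self.current_slice:
            # A new slice has started, so the per-slice counter resets.
            self.current_slice = index
            self.count = 0
        if self.count >= self.per_slice:
            return False  # burst exceeds the slice allowance
        self.count += 1
        return True
```

With a limit of 10 per second and 100 ms slices, only one request is admitted per slice, so ten requests arriving in the same instant no longer all pass even though the per-second limit would allow them.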

Backoff strategy

When 429 is returned by the customer-configured policy, downstream consumers should honor Retry-After and use exponential backoff.
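
On the consumer side, the advice above could look roughly like the sketch below. The function names, the base and cap defaults, and the send callable (assumed to return a status code and a headers dictionary) are all hypothetical; only the numeric-seconds form of Retry-After is handled, and the HTTP-date form falls back to exponential backoff.

```python
import random
import time


def retry_delay(attempt, retry_after=None, base=0.5, cap=30.0, jitter=random.random):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Prefer the server's hint when it is given as a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch
    # Exponential backoff with full jitter, capped at `cap` seconds.
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * jitter()


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it stops returning 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            sleep(retry_delay(attempt, headers.get("Retry-After")))
    return 429
```

Passing sleep and jitter as parameters keeps the sketch testable without real waiting; a production client would also typically cap total elapsed time.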

AI Prompt Token Tracking policy

Meters LLM input / output token usage per consumer, per route, per model. Backed by Redis or Hazelcast distributed counters and feeds the Quota / Rate Limit policy for AI cost caps.
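
To illustrate how metered token usage might feed a cost cap, the sketch below keeps a per-consumer, per-model running total and checks it against a budget. TokenBudget and its methods are hypothetical, and the in-memory dictionary is only a stand-in for the distributed Redis or Hazelcast counters mentioned above.

```python
class TokenBudget:
    """Hypothetical LLM token budget per (consumer, model) per period."""

    def __init__(self, limit_tokens, period_seconds):
        self.limit = limit_tokens
        self.period = period_seconds
        self.used = {}  # (consumer, model, window index) -> tokens consumed

    def _key(self, consumer, model, now):
        return (consumer, model, int(now // self.period))

    def record(self, consumer, model, input_tokens, output_tokens, now):
        """Add a completed call's token usage and return the running total."""
        key = self._key(consumer, model, now)
        self.used[key] = self.used.get(key, 0) + input_tokens + output_tokens
        return self.used[key]

    def allow(self, consumer, model, now):
        """True while the consumer still has budget left in the current window."""
        return self.used.get(self._key(consumer, model, now), 0) < self.limit
```

Because output token counts are only known after the model responds, this sketch checks the budget before a call and records usage afterwards, so a single call can overshoot the limit before later calls are rejected.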

AI Prompt Guard-Rails policy

Allow / deny / classify prompts before they reach an LLM upstream; rejects matching requests with 4xx response codes.
