Leonardo Ai Rate Limits
Reconciled concurrency, queue, and rate-limit behaviour for the Leonardo.AI Production API. Leonardo enforces per-API-key concurrency limits (a cap on parallel in-flight generations) on top of a queueing system, rather than a strict requests-per-minute or tokens-per-minute (RPM/TPM) token bucket. Requests over the concurrency limit are queued rather than rejected with HTTP 429, up to the queue depth.
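Because the limit described above is on parallel in-flight generations rather than on requests per minute, a client may prefer to cap its own concurrency instead of relying on the server-side queue. The following is a minimal sketch of that idea using an asyncio semaphore. It makes no real API calls: `submit_generation` is a hypothetical stand-in for a request, and `MAX_IN_FLIGHT` is an assumed value, since the actual per-key limit is not stated here.

```python
import asyncio

# Assumed value for illustration only; the real per-key limit is not given here.
MAX_IN_FLIGHT = 3


async def submit_generation(prompt: str) -> str:
    """Hypothetical stand-in for one generation request (no network I/O)."""
    await asyncio.sleep(0.01)  # simulate the time a generation is in flight
    return f"done:{prompt}"


async def run_all(prompts, limit=MAX_IN_FLIGHT, submit=submit_generation):
    """Run submit() for every prompt with at most `limit` calls in flight.

    Returns the results in input order plus the peak concurrency observed.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(prompt):
        nonlocal in_flight, peak
        # Callers beyond the limit wait here, i.e. they queue on the client.
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await submit(prompt)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(worker(p) for p in prompts))
    return list(results), peak


if __name__ == "__main__":
    outputs, observed_peak = asyncio.run(run_all([f"p{i}" for i in range(10)]))
    print(len(outputs), "generations finished; peak in flight:", observed_peak)
```

Holding excess work on the client keeps the server-side queue shallow, which may help avoid reaching the queue depth. Whether that trade-off is worthwhile depends on the actual limits for a given key.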
Leonardo Ai Rate Limits is the machine-readable rate-limit profile for Leonardo.AI on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures two rate-limit definitions across the Default Production API and Synchronous utility endpoints tiers.
The profile also documents response codes for unauthorized, forbidden, notFound, validationError, rateLimited, and serverError.
Tagged areas include AI, Rate Limiting, Concurrency, and Queue.