Openlayer Rate Limits

The Openlayer REST API authenticates with a Bearer API key and meters usage primarily through plan-level quotas, most notably the monthly cap on production inferences (rows published to inference pipelines) on the Basic plan. Openlayer does not publish explicit per-endpoint request-per-minute or token-per-minute limits in its public API reference, so this profile does not record specific throttling thresholds.

Openlayer Rate Limits is the machine-readable rate-limit profile for Openlayer on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 3 rate-limit definitions, measuring inferences, requests, and rows.

The profile also defines 3 policies (authentication, plan tiers, and backoff/retry) and documents the response code returned for throttled requests (429).

Tagged areas include AI, Evaluation, Testing, Observability, and LLM.

3 Limits · Throttle: 429
Tags: AI, Evaluation, Testing, Observability, LLM, MLOps, Rate Limiting, Quotas, Throttling

Limits

Production Inferences (Basic plan)
Scope: account
Unit: inferences
Limit: 20,000 per month
Notes: Rows published to inference pipelines; Basic plan cap. Enterprise is unlimited.

Requests Per Minute (RPM)
Scope: account
Unit: requests
Limit: see provider documentation
Notes: Per-endpoint request rate limits are not published in the public API reference.

Data Stream Throughput
Scope: inference_pipeline
Unit: rows
Limit: see provider documentation
Notes: Batch size and ingestion throughput for the data-stream endpoint are not publicly specified.
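
Because the only numeric limit listed here is the Basic plan's monthly cap, a client may want to check a batch against the remaining quota before publishing it. The following is a minimal sketch, assuming each published row counts as one inference and that the caller tracks its own running total for the month. The function names are hypothetical and not part of any Openlayer SDK, and how Openlayer itself meters or resets the quota is not specified in this profile.

```python
BASIC_MONTHLY_INFERENCE_CAP = 20_000  # Basic plan cap listed in this profile


def remaining_inferences(published_this_month, cap=BASIC_MONTHLY_INFERENCE_CAP):
    """Rows still available this month; never negative."""
    return max(0, cap - published_this_month)


def fits_in_quota(batch_rows, published_this_month, cap=BASIC_MONTHLY_INFERENCE_CAP):
    """True if publishing `batch_rows` more rows stays within the cap."""
    return batch_rows <= remaining_inferences(published_this_month, cap)
```

For example, with 18,500 rows already published, a 1,500-row batch fits exactly and a 1,501-row batch does not.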

Policies

API Key Authentication
All requests require a Bearer API key in the Authorization header; quotas apply per workspace/account.
Tiered Limits
Usage ceilings are governed by plan; Basic has a monthly inference cap, Enterprise removes caps.
Backoff Strategy
Clients should implement exponential backoff with jitter and honor Retry-After on 429 responses.
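
The authentication and backoff policies above can be sketched in client code as follows. This is an illustrative example using only the Python standard library, not Openlayer's own client. The `send` callable, the helper names, and the base and cap values are assumptions. Only the numeric (seconds) form of Retry-After is handled; an HTTP-date value falls back to the computed delay.

```python
import random
import time


def auth_headers(api_key):
    """Build the Bearer Authorization header required on every request."""
    return {"Authorization": f"Bearer {api_key}"}


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retrying after failed attempt `attempt` (0-based).

    A numeric Retry-After value takes priority. Otherwise use "full jitter":
    a random delay in [0, min(cap, base * 2**attempt)].
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            pass  # HTTP-date form is not parsed in this sketch
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers_dict).
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, headers
```

Passing `sleep` and `rng` as parameters keeps the retry logic testable without real waiting. In production the defaults (`time.sleep` and `random.random`) apply.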

Sources