Cerebras Rate Limits
Scaffolded rate limit definitions for the Cerebras API surface. Captures per-tier quotas, burst behavior, response signaling, and recovery semantics. Defaults are scaffold values to be replaced with published provider limits.
Cerebras Rate Limits is the machine-readable rate-limit profile for Cerebras on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures two rate-limit definitions, one for the free tier and one for the pro tier, both measured in requests_per_minute.
The profile also documents response codes for the throttled, quotaExceeded, and serviceUnavailable conditions.
Tagged areas include AI Inference, Large Language Models, Wafer Scale, Hardware, and Cloud.
2 limits. Throttled requests and exceeded quotas both return HTTP 429.
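Because the throttle and quota conditions above both map to 429, a client cannot tell them apart from the status code alone. The following is a minimal sketch of one common way to handle that, retrying with exponential backoff and jitter. It is not taken from the profile: `call_with_backoff`, the `send` callable, and all default values are hypothetical, and the use of a `Retry-After` header is an assumption (it is a standard HTTP header, but this profile does not say whether Cerebras sends it).

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out.

    send is a hypothetical zero-argument callable returning
    (status_code, headers_dict, body). It stands in for whatever HTTP
    client is actually in use.
    """
    status, headers, body = None, {}, None
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the last 429 back to the caller

        # Assumption: the server may send Retry-After as whole seconds.
        # The HTTP-date form of the header is not handled in this sketch.
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)
        else:
            # Exponential backoff with full jitter.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        sleep(min(delay, max_delay))
    return status, headers, body
```

If the 429 signals an exhausted quota rather than a short-lived throttle, retrying may never succeed, so the sketch returns the final 429 to the caller instead of looping indefinitely.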
Tags: AI Inference, Large Language Models, Wafer Scale, Hardware, Cloud, OpenAI Compatible, LLM, SDK, Accelerator, High Performance Computing, Rate Limiting, Quotas, Throttling
Limits
Free Tier Default · api-key · 10 requests_per_minute
Pro Tier Default · api-key · 120 requests_per_minute
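To make the two values concrete, the sketch below converts each requests_per_minute limit into a minimum spacing between calls and paces a client accordingly. The numbers are the scaffold defaults listed above, not published provider limits. `Pacer` and `min_interval_seconds` are hypothetical helpers, and even spacing is a deliberately conservative assumption, since the profile does not state how the provider counts its window or what bursts it allows.

```python
import time

# Scaffold values from the limits listed above (requests per minute).
LIMITS_RPM = {"free": 10, "pro": 120}


def min_interval_seconds(tier: str) -> float:
    """Even spacing between requests that stays within the tier's limit."""
    return 60.0 / LIMITS_RPM[tier]


class Pacer:
    """Hypothetical helper: spaces calls at least one interval apart."""

    def __init__(self, tier, clock=time.monotonic, sleep=time.sleep):
        self.interval = min_interval_seconds(tier)
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


print(min_interval_seconds("free"))  # 6.0 seconds between requests
print(min_interval_seconds("pro"))   # 0.5 seconds between requests
```

Calling `Pacer("free").wait()` before each request would then hold the client to at most one request every 6 seconds per API key, under the assumptions above.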