Claude Rate Limits
The Claude API enforces organization-level usage tiers (Tier 1-4 plus Monthly Invoicing) that cap monthly spend, along with per-minute rate limits per model class, expressed as RPM (requests per minute), ITPM (input tokens per minute), and OTPM (output tokens per minute). Organizations advance tiers automatically as cumulative credit purchases reach $5, $40, $200, and $400 for Tiers 1 through 4 respectively; higher limits and Monthly Invoicing are negotiated through sales. Cache reads do not count against ITPM on Claude 4.x models (a key advantage of prompt caching), and the Message Batches API has its own RPM and processing-queue caps. Managed Agents endpoints have a separate organization-wide cap (300 RPM for creates, 600 RPM for reads). All limits are enforced via the token-bucket algorithm and surfaced through detailed anthropic-ratelimit-* response headers.
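To make the token-bucket behaviour concrete, here is a minimal client-side sketch in Python. It illustrates the general algorithm only and is not the API's server-side implementation; the class name TokenBucket, the method try_consume, and the explicit now argument are hypothetical names chosen for this example.

```python
class TokenBucket:
    """Illustrative token bucket: capacity equals a per-minute limit,
    and the bucket refills continuously rather than resetting each minute."""

    def __init__(self, per_minute: float, now: float = 0.0):
        self.capacity = float(per_minute)       # maximum burst size
        self.refill_per_sec = per_minute / 60.0  # continuous refill rate
        self.tokens = self.capacity              # start with a full bucket
        self.updated = now                       # timestamp of the last refill

    def _refill(self, now: float) -> None:
        # Add tokens for the elapsed time, never exceeding capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now

    def try_consume(self, amount: float, now: float) -> bool:
        """Spend `amount` tokens if available; return False when throttled."""
        self._refill(now)
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False


# Example: a 60 RPM limit allows a burst of 60, then roughly one request per second.
bucket = TokenBucket(per_minute=60, now=0.0)
burst = [bucket.try_consume(1, now=0.0) for _ in range(61)]
```

The same structure can model ITPM or OTPM by consuming a request's token count instead of 1. In real use, time.monotonic() would supply now; it is passed in explicitly here so the behaviour is easy to check.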
Claude Rate Limits is the machine-readable rate-limit profile for Claude on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures 38 rate-limit definitions, measuring requests_per_minute, input_tokens_per_minute, output_tokens_per_minute, and spend_per_month.
The profile also defines 8 backoff/retry policies and documents the response codes for the throttled and serviceUnavailable conditions.
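As an illustration of what a backoff/retry policy can look like in client code, the Python sketch below honours a retry-after header when present and otherwise falls back to exponential backoff with full jitter. It does not reproduce any of the profile's 8 policies. The mapping of throttled and serviceUnavailable to HTTP 429 and 503, the function names, the send callable, and the default base and cap values are all assumptions made for this example.

```python
import random
import time

# Assumed mapping: "throttled" -> 429, "serviceUnavailable" -> 503.
RETRYABLE = {429, 503}


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before the next try (`attempt` is 0-based)."""
    if retry_after is not None:
        # Assumes retry-after carries a number of seconds, not an HTTP date.
        return max(0.0, float(retry_after))
    # Exponential backoff with full jitter, bounded by `cap`.
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body),
    where `headers` is a dict with lowercase keys.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        sleep(retry_delay(attempt, headers.get("retry-after")))
```

Preferring the server-supplied retry-after value over a locally computed delay keeps the client aligned with the bucket's actual refill time, while the jitter spreads out retries from clients that were throttled at the same moment.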
Tagged areas include Artificial Intelligence, Generative AI, Large Language Models, and Rate Limiting.