Not Diamond · Rate Limits

Name: Notdiamond Rate Limits
Creator: Not Diamond
Keywords: AI, LLM, Model Routing, Router, Orchestration, Rate Limiting, Quotas, Throttling

Not Diamond does not publicly document specific request or token rate limits for its REST API. Access is gated by API key and account tier (Early Access vs. Enterprise). The router adds roughly 100-200 ms of latency per routing decision. This profile therefore records no specific per-account RPM or TPM values.

Notdiamond Rate Limits is the machine-readable rate-limit profile for Not Diamond on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 3 rate-limit definitions, measuring requests, tokens, and milliseconds.

The profile also defines 2 policies, covering tiered access and backoff/retry, and documents 429 as the response code for throttled requests.

Tagged areas include AI, LLM, Model Routing, Router, and Orchestration.

Limits: 3
Throttle response code: 429

Limits

Requests Per Minute (RPM)
Scope: account
Unit: requests
Value: see provider documentation
Note: Not publicly documented; varies by account tier.
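Because the RPM ceiling is not published, a client can only pace itself against a budget it chooses or is given for its tier. The following is a minimal sketch of such client-side pacing, not part of any Not Diamond SDK: the class name RequestPacer and the rpm value are hypothetical placeholders, and the real limit for an account should come from the provider.

```python
import time


class RequestPacer:
    """Client-side pacer that spaces calls evenly to stay under an RPM budget.

    The rpm argument is an assumption supplied by the caller; Not Diamond
    does not publish a value, so it is not taken from any documented limit.
    """

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.interval = 60.0 / rpm  # minimum seconds between requests
        self.clock = clock          # injectable for testing
        self.sleep = sleep          # injectable for testing
        self.next_allowed = clock()

    def wait(self):
        """Block until the next request slot is free; return seconds waited."""
        now = self.clock()
        delay = max(0.0, self.next_allowed - now)
        if delay > 0:
            self.sleep(delay)
        # Reserve the slot after the one just used.
        self.next_allowed = max(now, self.next_allowed) + self.interval
        return delay
```

Calling pacer.wait() before each request keeps the client at or below the chosen budget. Even pacing is a conservative choice; a token bucket would allow short bursts but is more likely to trip an undocumented limit.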

Tokens Per Minute (TPM)
Scope: account
Unit: tokens
Value: see provider documentation
Note: Not publicly documented; routing billed per 1M tokens.

Router Latency
Scope: request
Unit: milliseconds
Value: ~100-200 ms per routing decision
Note: Average additional latency added by the router, not a throttling limit.

Policies

Tiered Access

Limits and capabilities differ between Early Access and Enterprise tiers.

Backoff Strategy

Clients should implement exponential backoff with jitter and honor Retry-After on 429 responses.
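The backoff policy above can be sketched as follows. This is an illustrative sketch under stated assumptions, not Not Diamond's client code: the send callable, its (status, headers, body) return shape, and the base and cap defaults are all hypothetical. It uses the "full jitter" variant of exponential backoff and prefers Retry-After when the server supplies it in delta-seconds form.

```python
import random
import time


def parse_retry_after(value):
    """Return Retry-After as seconds if given in delta-seconds form, else None."""
    if value is None:
        return None
    try:
        seconds = float(value.strip())
    except ValueError:
        return None  # HTTP-date form is not handled in this sketch
    return max(0.0, seconds)


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(send, max_attempts=5, sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical zero-argument callable returning
    (status_code, headers, body); headers is assumed to be a mapping
    keyed by the exact name "Retry-After".
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; do not sleep again
        delay = parse_retry_after(headers.get("Retry-After"))
        if delay is None:
            delay = backoff_delay(attempt, rng=rng)
        sleep(delay)
    return status, body
```

Jitter spreads retries from many clients over time so they do not all return at the same instant, and the cap keeps the worst-case wait bounded.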
