Not Diamond · Rate Limits
Notdiamond Rate Limits
Not Diamond does not publicly document specific request or token rate limits for its REST API. Access is gated by API key and account tier (Early Access vs. Enterprise). The router adds roughly 100-200 ms of latency per routing decision. This profile does not record specific per-account RPM/TPM values.
Notdiamond Rate Limits is the machine-readable rate-limit profile for Not Diamond on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures 3 rate-limit definitions, measuring requests, tokens, and milliseconds.
The profile also defines 2 backoff/retry policies and documents the response code returned for throttled requests.
Tagged areas include AI, LLM, Model Routing, Router, and Orchestration.
3 Limits
Throttle: 429
AI, LLM, Model Routing, Router, Orchestration, Rate Limiting, Quotas, Throttling
Limits
Requests Per Minute (RPM), scope: account
see provider documentation
Not publicly documented; varies by account tier.
Tokens Per Minute (TPM), scope: account
see provider documentation
Not publicly documented; routing billed per 1M tokens.
Router Latency, scope: request
~100-200 ms per routing decision
Average additional latency added by the router, not a throttling limit.
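Because the router latency is overhead rather than a limit, one place it may matter is a client's timeout budget. The sketch below is a minimal illustration of that arithmetic. The function names and the 10-second model timeout are hypothetical, and the 100-200 ms range is an average from this profile, not a guaranteed upper bound.

```python
# Approximate per-decision router overhead from this profile (an average, not a bound).
ROUTER_OVERHEAD_MS = (100, 200)


def total_timeout_ms(model_timeout_ms, overhead_ms=ROUTER_OVERHEAD_MS[1]):
    """Widen a model-call timeout by the router overhead (upper end by default)."""
    return model_timeout_ms + overhead_ms


def overhead_fraction(model_latency_ms, overhead_ms):
    """Share of end-to-end latency attributable to the routing decision."""
    return overhead_ms / (model_latency_ms + overhead_ms)


if __name__ == "__main__":
    # Hypothetical 10 s model timeout, padded for routing.
    print(total_timeout_ms(10_000))
    # Routing share for a hypothetical 800 ms model call at 200 ms overhead.
    print(overhead_fraction(800, 200))
```

For long generations the overhead is a small share of total latency; for short, latency-sensitive calls it can be a noticeable fraction.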
Policies
Tiered Access
Limits and capabilities differ between Early Access and Enterprise tiers.
Backoff Strategy
Clients should implement exponential backoff with jitter and honor Retry-After on 429 responses.
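The backoff policy above can be sketched as follows. This is a minimal illustration, not Not Diamond's client code: `send` is a hypothetical callable returning `(status_code, headers, body)`, and the base delay, cap, and attempt count are assumed values rather than documented ones. It honors Retry-After when the header is present and parseable, and otherwise falls back to exponential backoff with full jitter.

```python
import random
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return seconds to wait from a Retry-After value, or None if unusable."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():  # delta-seconds form, e.g. "120"
        return float(value)
    try:  # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is a hypothetical callable returning (status_code, headers, body);
    headers is assumed to be a dict keyed by "Retry-After" in that exact case.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        wait = parse_retry_after(headers.get("Retry-After"))
        if wait is None:
            wait = backoff_delay(attempt)
        sleep(wait)
    return status, body
```

Passing `sleep` and `rng` as parameters keeps the retry loop easy to test without real delays. With a real HTTP client, header lookup is usually case-insensitive, so the exact-case assumption here would not apply.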