Nvidia Rate Limits
NVIDIA's developer API surface spans multiple products. Hosted NIM endpoints on build.nvidia.com have per-account / per-API-key rate limits and free-credit budgets that rotate by promotion; specific RPM/TPM numbers are not consistently published. Self-hosted NIM (via an AI Enterprise license) has no NVIDIA-side rate limits; throughput is bounded by the customer's GPU hardware. NGC downloads are throttled per IP by the catalog CDN.
Nvidia Rate Limits is the machine-readable rate-limit profile for Nvidia on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures 3 rate-limit definitions, whose units are listed as varies and requests_per_second.
The profile also defines 4 backoff/retry policies and documents response codes for unauthorized, forbidden, throttled, and serviceUnavailable.
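To make the backoff/retry idea concrete, here is a minimal Python sketch of a client-side retry loop. It is not taken from the profile: the mapping of throttled to HTTP 429 and serviceUnavailable to HTTP 503, the use of a Retry-After header, the delay values, and the helper names `backoff_delay` and `call_with_retry` are all assumptions made for illustration. Responses such as unauthorized and forbidden are returned to the caller without retrying, since waiting would not normally fix a credential problem.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

# (status code, response headers, body); a hypothetical shape for this sketch.
Response = Tuple[int, Mapping[str, str], bytes]

# Assumption: "throttled" maps to HTTP 429 and "serviceUnavailable" to HTTP 503.
RETRYABLE = {429, 503}


def backoff_delay(attempt: int, retry_after: Optional[str],
                  base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before the retry that follows 0-based `attempt`."""
    if retry_after is not None:
        try:
            # Honor a numeric Retry-After value if the server sends one.
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # The HTTP-date form is not handled in this sketch.
    # Otherwise use exponential backoff with full jitter.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, headers, body
        # Assumes the header key is spelled exactly "Retry-After".
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise ValueError("max_attempts must be at least 1")
```

In real use, `send` would wrap whatever HTTP client call the application already makes; passing `sleep` as a parameter keeps the loop easy to exercise without actually waiting.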
Tagged areas include GPU, AI, Machine Learning, Computing, and Graphics.