
TensorFlow Rate Limits

TensorFlow is an open-source library, not a hosted API service; there is no central TensorFlow rate limiter. Inference throughput is bounded only by the hardware and serving stack (TensorFlow Serving, TF Lite runtime, custom inference servers) the user deploys.
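Because the library imposes no limit of its own, an operator who wants one has to add it in front of their own serving layer. The following is a minimal sketch of one common approach, a token bucket, and is not part of TensorFlow or TensorFlow Serving. The class name `TokenBucket`, its parameters, and the injectable `clock` are all hypothetical names chosen for illustration.

```python
import time


class TokenBucket:
    """Hypothetical limiter an operator might place in front of a self-hosted model server."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = float(rate_per_sec)   # tokens added per second
        self.capacity = float(capacity)   # maximum burst size
        self.clock = clock                # injectable so the limiter can be tested
        self.tokens = float(capacity)     # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True if a request of the given cost may proceed now."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would check `allow()` before forwarding each inference request and reject or queue the request when it returns `False`. The rate and capacity are deployment choices, not values supplied by TensorFlow.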

TensorFlow Rate Limits is the machine-readable rate-limit profile for TensorFlow on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 1 rate-limit definition, whose value is listed as "varies" because it depends on the deployment.

The profile also defines 1 backoff/retry policy.

Tagged areas include Rate Limiting, Machine Learning, and Open Source.


Limits

Self-Hosted Inference deployment: varies (bounded by self-hosted hardware / serving stack)

Policies

Self-Hosted Throughput
Throughput is a function of the user's deployment (CPU/GPU/TPU, batch size, model size, serving stack); TensorFlow itself does not throttle.
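As a rough illustration of how these deployment factors combine, the sketch below computes an idealized upper bound on requests per second from batch size, per-batch latency, and replica count. The function name `estimated_throughput` and the formula are simplifying assumptions, not a TensorFlow API: the estimate ignores queuing, network overhead, and batching delays, and real figures have to be measured on the actual deployment.

```python
def estimated_throughput(batch_size, batch_latency_s, replicas=1):
    """Idealized requests/second for a self-hosted deployment (hypothetical helper).

    batch_size:       requests processed together in one forward pass
    batch_latency_s:  measured wall-clock seconds to process one batch
    replicas:         number of identical model servers running in parallel
    """
    if batch_size <= 0 or batch_latency_s <= 0 or replicas <= 0:
        raise ValueError("all inputs must be positive")
    # Each replica completes batch_size requests every batch_latency_s seconds.
    return replicas * batch_size / batch_latency_s


# Example: 2 replicas, batches of 32, 0.25 s per batch.
print(estimated_throughput(32, 0.25, replicas=2))  # 256.0
```

The per-batch latency is the term that model size and hardware (CPU/GPU/TPU) influence, which is why the same model can show very different throughput across deployments.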
