Open WebUI · Rate Limits
Open WebUI Rate Limits
Open WebUI does not impose project-level API rate limits. Effective limits are determined by (1) the upstream LLM backend's limits (Ollama concurrency, or OpenAI/Anthropic RPM caps) and (2) any reverse-proxy or admin throttling configured in the deployment. Standard HTTP semantics apply, with 429 documented as the response code for throttled requests.
Open WebUI Rate Limits is the machine-readable rate-limit profile for Open WebUI on the APIs.io network, conforming to the API Commons Rate Limits specification.
It captures 2 rate-limit definitions, with metrics listed as n/a and requests.
The profile also defines 2 backoff/retry policies and documents the response code returned for throttled requests.
Tagged areas include LLM, Open Source, Self-Hosted, Ollama, and Chat UI.
2 Limits
Throttle: 429
Tags: LLM, Open Source, Self-Hosted, Ollama, Chat UI, RAG, Rate Limiting, Quotas, Throttling
Limits
Project-level · n/a · no built-in cap
Open WebUI itself does not throttle; configure limits at the reverse proxy if needed.
Upstream LLM backend · external · backend-defined
Ollama concurrency or OpenAI/Anthropic RPM caps apply.
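Because the effective limits come from the backend or a proxy rather than from Open WebUI, a client may simply see a 429 and have to decide how long to wait. The following is a minimal client-side sketch of that, not part of Open WebUI or the profile. It assumes a hypothetical `send` callable that performs one request and returns `(status, headers, body)`, and it assumes the throttling layer may or may not send a `Retry-After` header in seconds.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base_delay: float = 1.0, max_delay: float = 30.0) -> float:
    """Return how many seconds to wait before the next attempt."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    retry_after = lowered.get("retry-after", "").strip()
    if retry_after.isdigit():
        # Honour an integer Retry-After, capped at max_delay.
        return min(float(retry_after), max_delay)
    # Otherwise fall back to exponential backoff with full jitter.
    return random.uniform(0.0, min(max_delay, base_delay * 2 ** attempt))


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns something other than 429 or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _ = response
        if status != 429:
            break
        sleep(retry_delay(headers, attempt))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays. The HTTP-date form of `Retry-After` is not handled here and falls through to the jittered backoff.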
Policies
Reverse-Proxy Throttling
Use Nginx/Caddy/Traefik in front of Open WebUI to enforce per-IP or per-user limits if needed.
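To illustrate what a per-IP or per-user limit means in practice, here is a small token-bucket sketch. It is not the configuration of Nginx, Caddy, or Traefik, and the proxies may use different algorithms internally; the class names, the `rate` and `burst` parameters, and the client key are all hypothetical and chosen only for this example.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows `burst` requests at once, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, burst: int, now: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill according to elapsed time, never above the burst size.
        elapsed = now - self.updated
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Keeps one bucket per client key (for example an IP address or user id)."""

    def __init__(self, rate: float = 1.0, burst: int = 5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def status_for(self, client_key: str) -> int:
        """Return 200 if the request may pass, or 429 if it should be throttled."""
        now = self.clock()
        bucket = self.buckets.get(client_key)
        if bucket is None:
            bucket = self.buckets[client_key] = TokenBucket(self.rate, self.burst, now)
        return 200 if bucket.allow(now) else 429
```

Each client key gets its own bucket, so one busy client exhausting its tokens does not affect another.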
Backend Concurrency
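Tune Ollama's OLLAMA_NUM_PARALLEL to control how many requests each model serves in parallel, or list multiple backends in Open WebUI's OPENAI_API_BASE_URLS to spread load.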