Open WebUI · Rate Limits

Name: Open WebUI Rate Limits
Creator: Open WebUI
Keywords: LLM, Open Source, Self-Hosted, Ollama, Chat UI, RAG, Rate Limiting, Quotas, Throttling

Open WebUI does not impose project-level API rate limits. Effective limits are determined by (1) the upstream LLM backend's limits (Ollama concurrency, or OpenAI/Anthropic RPM caps) and (2) any reverse-proxy or admin throttling configured in the deployment. Standard HTTP semantics apply.

Open WebUI Rate Limits is the machine-readable rate-limit profile for Open WebUI on the APIs.io network. It conforms to the API Commons Rate Limits specification.

It captures two rate-limit definitions: a project-level entry with no metric (n/a) and an upstream-backend entry measured in requests.

The profile also defines two backoff/retry policies and documents the response code returned for throttled requests.

Tagged areas include LLM, Open Source, Self-Hosted, Ollama, and Chat UI.

Limits: 2
Throttle response code: 429
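Because throttling surfaces as HTTP 429 from whichever layer enforces it, a client can treat it like any other retryable response. The following is a minimal sketch, not part of Open WebUI: `retry_delay` and `call_with_retry` are hypothetical helper names, `send` stands in for whatever function performs the request, and the header lookup assumes the exact key `Retry-After` with a value in seconds.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Prefer the server's hint when it is a plain number of seconds.
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    # Otherwise fall back to capped exponential backoff: base * 2^attempt.
    return min(cap, base * (2 ** attempt))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _ = response
        if status != 429:
            break
        sleep(retry_delay(attempt, headers.get("Retry-After")))
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; production code would normally also add jitter so that many clients do not retry in lockstep.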

Limits

Project-level (n/a)
Metric: n/a
Limit: no built-in cap
Open WebUI itself does not throttle; configure limits at the reverse proxy if needed.

Upstream LLM backend (external)
Metric: requests
Limit: backend-defined
Ollama concurrency or OpenAI/Anthropic RPM caps apply.

Policies

Reverse-Proxy Throttling

Place a reverse proxy such as Nginx, Caddy, or Traefik in front of Open WebUI to enforce per-IP or per-user limits if needed.
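To make "per-IP or per-user limits" concrete, here is a conceptual token-bucket sketch keyed by client identifier. It is an illustration only: it is not how Nginx, Caddy, or Traefik implement throttling, each of which has its own configuration syntax, and the class name `TokenBucketLimiter` and its parameters are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Allow `rate` requests per second per key, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        # key (IP address or user id) -> (tokens remaining, time of last check)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        tokens, last = self._buckets.get(key, (float(self.burst), now))
        # Refill in proportion to elapsed time, never above the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[key] = (tokens, now)
        return allowed
```

A caller would answer with HTTP 429 whenever `allow(key)` returns `False`. Each key has its own bucket, so one busy client does not consume another client's allowance.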

Backend Concurrency

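Tune Ollama's OLLAMA_NUM_PARALLEL setting to control how many requests are served concurrently, or list several backends in Open WebUI's OPENAI_API_BASE_URLS setting to spread load across them.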
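As a rough illustration of spreading load over several backends, the sketch below parses a list of base URLs and rotates through them. It assumes the list is semicolon-separated, as OPENAI_API_BASE_URLS is commonly written. The host names are placeholders, the helpers `parse_base_urls` and `round_robin` are hypothetical, and the rotation shown is not a description of how Open WebUI itself selects a backend.

```python
import itertools
from typing import Iterator, List


def parse_base_urls(value: str) -> List[str]:
    """Split a semicolon-separated URL list, dropping blanks and trailing slashes."""
    return [u.strip().rstrip("/") for u in value.split(";") if u.strip()]


def round_robin(urls: List[str]) -> Iterator[str]:
    """Return an endless iterator that rotates through the given backends."""
    if not urls:
        raise ValueError("at least one backend URL is required")
    return itertools.cycle(urls)


# Placeholder hosts, for illustration only.
backends = parse_base_urls("http://host-one:11434/v1; http://host-two:11434/v1/;")
picker = round_robin(backends)
first_three = [next(picker) for _ in range(3)]
```

With two backends, consecutive calls to `next(picker)` alternate between them, so each host receives roughly half of the requests.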
