Haystack AI Rate Limits

The open-source Haystack framework and self-hosted Hayhooks deployments are not rate limited by deepset; throughput is bounded only by your own infrastructure and by the model and embedding providers your pipelines call. The hosted deepset Cloud REST API applies request timeouts (about 2 minutes for search requests and about 3 minutes for all other requests), uses standard pagination, and enforces account- and plan-level quotas that are set by enterprise agreement rather than published as a per-minute table. This profile does not reconcile specific numeric limits.

Haystack AI Rate Limits is the machine-readable rate-limit profile for Haystack / deepset on the APIs.io network, conforming to the API Commons Rate Limits specification.

It captures 5 rate-limit definitions, measuring duration, items, and requests.

The profile also defines 2 backoff/retry policies and documents the response code returned for throttled requests.

Tagged areas include AI, LLM, RAG, Open Source, and Rate Limiting.

5 Limits · Throttle: 429
Tags: AI, LLM, RAG, Open Source, Rate Limiting, Quotas, Throttling

Limits

Search Request Timeout
Scope: request
Metric: duration
Limit: ~120 seconds
Search requests time out at roughly 2 minutes.

Other Request Timeout
Scope: request
Metric: duration
Limit: ~180 seconds
Non-search requests time out at roughly 3 minutes.

Pagination Limit
Scope: request
Metric: items
Limit: see provider documentation
Page-based (page_number/limit) and cursor-based (before/after) pagination.

Account Quotas
Scope: account
Metric: requests
Limit: governed by plan/contract
Hosted deepset Cloud quotas are set by enterprise agreement, not published.

Self-Hosted (Haystack / Hayhooks)
Scope: deployment
Metric: requests
Limit: none imposed by provider
Throughput bounded only by your infrastructure and downstream model providers.
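
The timeout and pagination entries above can be sketched as client-side helpers. This is a minimal illustration under stated assumptions, not the deepset Cloud client: `client_timeout`, `iter_pages`, the `margin` value, and the `fetch_page` callable are hypothetical names introduced here. The sketch also assumes 1-based page numbers and that a page shorter than `limit` marks the end of the results; check the provider documentation for the actual response shape. Cursor-based (before/after) pagination is not covered.

```python
from typing import Callable, Dict, Iterator, List

# Approximate server-side timeouts from the entries above, in seconds.
SERVER_TIMEOUT_SECONDS = {"search": 120.0, "other": 180.0}


def client_timeout(kind: str, margin: float = 5.0) -> float:
    """Client-side timeout set slightly above the approximate server timeout.

    `kind` and `margin` are hypothetical parameters chosen for this sketch.
    Unknown kinds fall back to the non-search timeout.
    """
    server = SERVER_TIMEOUT_SECONDS.get(kind, SERVER_TIMEOUT_SECONDS["other"])
    return server + margin


# Hypothetical fetcher: receives query parameters and returns that page's items.
FetchPage = Callable[[Dict[str, int]], List[dict]]


def iter_pages(fetch_page: FetchPage, limit: int = 10,
               start_page: int = 1) -> Iterator[dict]:
    """Yield items across pages using page_number/limit parameters.

    Assumes 1-based pages and that a short (or empty) page is the last one.
    """
    page_number = start_page
    while True:
        items = fetch_page({"page_number": page_number, "limit": limit})
        yield from items
        if len(items) < limit:
            return  # short page: nothing further to request
        page_number += 1
```

Setting the client timeout a little above the server value is a design choice in this sketch: it lets the server's own timeout response arrive before the client gives up. Passing the fetcher in as a callable keeps the loop independent of any particular HTTP library.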

Policies

Plan-Governed Limits
Hosted limits are defined per customer plan and enterprise agreement.

Backoff Strategy
Clients should implement exponential backoff with jitter and honor Retry-After on 429.
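
The backoff strategy above can be sketched as follows. This is an illustrative outline, not an official client: `retry_delay`, `call_with_backoff`, the `(status, headers, body)` response shape, and the `base`, `cap`, and `max_attempts` defaults are all assumptions made for the example. It uses the "full jitter" variant of exponential backoff and only handles the delay-in-seconds form of Retry-After; the HTTP-date form falls back to jittered backoff.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Assumed response shape for this sketch: (status_code, headers, body).
Response = Tuple[int, Mapping[str, str], object]


def retry_delay(attempt: int, headers: Mapping[str, str],
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # delay-seconds form
        except ValueError:
            pass  # HTTP-date form is not parsed here; use backoff instead
    # Full jitter: uniform between 0 and the capped exponential delay.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(retry_delay(attempt, response[1]))
        response = send()
    return response
```

The `sleep` parameter is injectable so the retry loop can be exercised without real waiting; in normal use the default `time.sleep` applies. If every attempt is throttled, the last 429 response is returned for the caller to handle.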

Sources