AI Rate Limiting Advanced Plugin
Token-aware rate limiting tailored for LLM traffic, with per-consumer and per-model budgets rather than just request counts.
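To illustrate the difference between counting requests and counting tokens, here is a minimal sketch of a fixed-window token budget keyed by consumer and model. It is not the plugin's implementation or configuration. The class name `TokenBudgetLimiter`, its parameters, the fixed-window strategy, and the consumer and model names are all hypothetical and chosen only for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TokenBudgetLimiter:
    """Hypothetical fixed-window limiter that budgets tokens, not requests."""

    # Assumed shape: {(consumer, model): max tokens allowed per window}
    limits: dict
    window_seconds: float = 60.0
    # Internal state: {(consumer, model): (window_start, tokens_used)}
    _usage: dict = field(default_factory=dict, init=False)

    def allow(self, consumer, model, tokens, now=None):
        """Return True and record usage if `tokens` fits in the current window."""
        now = time.monotonic() if now is None else now
        key = (consumer, model)
        limit = self.limits.get(key)
        if limit is None:
            # Assumption: pairs without a configured budget are not limited.
            return True

        start, used = self._usage.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, used = now, 0

        if used + tokens > limit:
            self._usage[key] = (start, used)
            return False

        self._usage[key] = (start, used + tokens)
        return True


limiter = TokenBudgetLimiter(limits={("alice", "model-a"): 1000})
first = limiter.allow("alice", "model-a", 600, now=0.0)    # fits the budget
second = limiter.allow("alice", "model-a", 600, now=1.0)   # would exceed 1000
```

In this sketch two requests from the same consumer are treated differently depending on how many tokens they carry, which a plain request counter cannot express.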