AI Semantic Cache Plugin
Caches LLM responses by prompt similarity so semantically equivalent requests can be served from cache, reducing latency and provider spend.
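
As a rough illustration of similarity-based caching (not the plugin's actual implementation), the sketch below looks up a cached response when a new prompt is close enough to a stored one. The `embed` function is a toy bag-of-words stand-in for a real embedding model, and the names `SemanticCache`, `get`, `put`, and the `threshold` value of 0.8 are all hypothetical, chosen only for this example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical in-memory cache keyed by prompt similarity."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # minimum similarity counted as a hit
        self.entries = []           # list of (prompt vector, response)

    def get(self, prompt):
        """Return the most similar cached response, or None on a miss."""
        vec = embed(prompt)
        best_score, best_response = 0.0, None
        for cached_vec, response in self.entries:
            score = cosine(vec, cached_vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, prompt, response):
        """Store a response under the prompt's vector."""
        self.entries.append((embed(prompt), response))


cache = SemanticCache(threshold=0.8)
cache.put("What is the capital of France?", "Paris")

hit = cache.get("what is the capital of france")   # same words, different form
miss = cache.get("How do I bake bread?")           # unrelated prompt
```

On a hit the stored response can be returned without calling the provider, while a miss falls through to the LLM and the result is stored with `put`. A real deployment would presumably swap the toy `embed` for a proper embedding model and the linear scan for a vector index, and the threshold would need tuning: too low risks serving answers to prompts that only look similar.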