Kong AI Gateway

AI Semantic Cache Plugin

Caches LLM responses by prompt similarity so semantically equivalent requests can be served from cache, reducing latency and provider spend.

Documentation GitHub

Documentation

📖

Documentation

https://developer.konghq.com/plugins/ai-semantic-cache/

API entry from apis.yml

aid: kong-ai-gateway:ai-semantic-cache-plugin
name: AI Semantic Cache Plugin
description: Caches LLM responses by prompt similarity so semantically equivalent requests can be served
  from cache, reducing latency and provider spend.
humanURL: https://developer.konghq.com/plugins/ai-semantic-cache/
tags:
- Plugin
- Caching
- Semantic
properties:
- type: Documentation
  url: https://developer.konghq.com/plugins/ai-semantic-cache/