LLMWhisperer
LLMWhisperer is a document-to-text extraction API from Unstract (Zipstack) that turns complex PDFs, scanned documents, and images into clean, layout-preserving text ready for large language models. It exposes an asynchronous REST API (v2): submit a document to /whisper, poll /whisper-status, then retrieve the extracted text from /whisper-retrieve. It also provides line-level highlight coordinates and webhook callbacks. Requests are authenticated with the unstract-key header.
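The submit, poll, retrieve sequence described above can be sketched roughly as follows. This is a minimal illustration, not the official client: the `transport` callable, the `extract_text` name, and the base URL passed by the caller are hypothetical, and it assumes that the submit response carries a `whisper_hash` field, that the status response carries a `status` field, and that `whisper_hash` is passed as a query parameter.

```python
import time
from typing import Callable, Dict, Optional

# Hypothetical transport: performs one HTTP request and returns the decoded
# JSON body. Arguments: method, url, headers, query params, request body.
Transport = Callable[[str, str, Dict[str, str], Dict[str, str], Optional[bytes]], dict]


def extract_text(
    transport: Transport,
    base_url: str,
    api_key: str,
    document: bytes,
    poll_seconds: float = 2.0,
    max_polls: int = 60,
) -> str:
    """Submit a document, wait until it is processed, and return its text."""
    headers = {"unstract-key": api_key}  # authentication header named in the source

    # Step 1: submit the document (assumes the response includes whisper_hash).
    submitted = transport("POST", f"{base_url}/whisper", headers, {}, document)
    query = {"whisper_hash": submitted["whisper_hash"]}

    # Step 2: poll until the job is processed, fails, or we give up.
    for _ in range(max_polls):
        reply = transport("GET", f"{base_url}/whisper-status", headers, query, None)
        status = reply["status"]  # assumed field name
        if status == "processed":
            break
        if status == "error":
            raise RuntimeError(f"extraction failed: {reply}")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("document was not processed within the polling budget")

    # Step 3: retrieve the extracted text.
    result = transport("GET", f"{base_url}/whisper-retrieve", headers, query, None)
    return result["result_text"]
```

Injecting the transport keeps the control flow separate from the choice of HTTP library and lets the sequence be exercised without network access.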
APIs
LLMWhisperer Whisper Extraction API
Submits a document (PDF, image, or URL) to POST /whisper for asynchronous, layout-preserving text extraction across native_text, low_cost, high_quality, form, and table modes. R...
LLMWhisperer Whisper Status API
GET /whisper-status returns the processing state (accepted, processing, processed, error, retrieved) for a whisper_hash, with per-page execution detail.
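One way a caller might act on the five states listed above is sketched below. The state strings come from the description; the `WhisperStatus` enum, the `next_action` helper, and the action names are hypothetical, and the mapping itself is an assumption about how a client would typically react.

```python
from enum import Enum


class WhisperStatus(str, Enum):
    """Processing states listed for /whisper-status."""
    ACCEPTED = "accepted"
    PROCESSING = "processing"
    PROCESSED = "processed"
    ERROR = "error"
    RETRIEVED = "retrieved"


def next_action(status: str) -> str:
    """Map a reported status to what a polling client should do next."""
    state = WhisperStatus(status)  # raises ValueError for an unknown status
    if state in (WhisperStatus.ACCEPTED, WhisperStatus.PROCESSING):
        return "wait"      # still in progress: poll again later
    if state is WhisperStatus.PROCESSED:
        return "retrieve"  # result is ready to be fetched
    if state is WhisperStatus.ERROR:
        return "fail"      # stop polling and surface the error
    return "done"          # retrieved: the result was already fetched
```

Rejecting unknown strings early keeps a polling loop from spinning on a state it does not understand.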
LLMWhisperer Whisper Retrieve API
GET /whisper-retrieve returns the extracted result_text plus optional confidence_metadata for a processed whisper_hash. A result can be retrieved only once.
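Because a result can be fetched once, a client will probably want to persist it on first retrieval. The sketch below shows one possible approach using a local JSON cache; `retrieve_once`, the `fetch` callable, and the cache layout are hypothetical and not part of the API.

```python
import json
from pathlib import Path
from typing import Callable


def retrieve_once(fetch: Callable[[str], dict], whisper_hash: str, cache_dir: str) -> dict:
    """Return the result for whisper_hash, calling fetch at most once per hash.

    fetch is a hypothetical callable that performs GET /whisper-retrieve and
    returns the decoded JSON (result_text and, optionally, confidence_metadata).
    """
    cache_file = Path(cache_dir) / f"{whisper_hash}.json"
    if cache_file.exists():
        # Already retrieved earlier: serve the saved copy instead of the API.
        return json.loads(cache_file.read_text(encoding="utf-8"))

    result = fetch(whisper_hash)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result
```

Writing the payload to disk before returning means a crash in later processing does not lose the only copy of the extracted text.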
LLMWhisperer Highlights API
GET /highlights returns per-line bounding-box coordinates (base_y, height, page, page_height) for the requested lines so callers can highlight extracted text in the source docum...
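To draw a highlight, the returned coordinates need to be scaled to the size of the rendered page image. The sketch below assumes that base_y marks the bottom edge of the line measured from the top of the page, and that base_y, height, and page_height share the same units; both are assumptions to check against the official documentation. `LineBox` and `to_pixel_band` are hypothetical names.

```python
from typing import NamedTuple, Tuple


class LineBox(NamedTuple):
    """The four per-line fields named in the /highlights description."""
    page: int
    base_y: float
    height: float
    page_height: float


def to_pixel_band(line: LineBox, image_height: int) -> Tuple[int, int]:
    """Scale a line's vertical extent to pixel rows of a rendered page image.

    Assumes base_y is the bottom edge of the line, measured from the page top.
    Returns (top, bottom) pixel rows.
    """
    scale = image_height / line.page_height
    top = round((line.base_y - line.height) * scale)
    bottom = round(line.base_y * scale)
    return top, bottom
```

For example, a line with base_y 200 and height 50 on a page of height 1000 maps to rows 75 through 100 of a 500-pixel-tall image.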
LLMWhisperer Webhooks API
Register and manage webhook callbacks via /whisper-manage-callback (POST/GET/PUT/DELETE). Submit a document with use_webhook to have the extracted result delivered to your endpo...