LLMWhisperer

LLMWhisperer is a document-to-text extraction API from Unstract (Zipstack) that turns complex PDFs, scanned documents, and images into clean, layout-preserving text ready for large language models. It exposes an asynchronous REST API (v2): submit a document to /whisper, poll /whisper-status, then retrieve the extracted text via /whisper-retrieve. It also provides line-level highlight coordinates and webhook callbacks. Authentication uses an API key sent in the unstract-key header.

AI, LLM, Document Extraction, OCR, Text Extraction

APIs

LLMWhisperer Whisper Extraction API

Submits a document (PDF, image, or URL) to POST /whisper for asynchronous, layout-preserving text extraction across native_text, low_cost, high_quality, form, and table modes. R...
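
As a rough illustration of the submit step, the sketch below builds (but does not send) a POST /whisper request using only the Python standard library. The base URL, the `mode` and `output_mode` query parameters, and the `unstract-key` header come from the collection in the Sources section; the `Content-Type` value and the helper name `build_whisper_request` are assumptions made for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL as it appears in the collection under Sources.
BASE_URL = "https://llmwhisperer-api.us-central.unstract.com/api/v2"


def build_whisper_request(document: bytes, api_key: str,
                          mode: str = "form",
                          output_mode: str = "layout_preserving") -> Request:
    """Build a POST /whisper request carrying the document as a binary body.

    Hypothetical helper: nothing is sent until the caller passes the
    returned object to urllib.request.urlopen().
    """
    query = urlencode({"mode": mode, "output_mode": output_mode})
    return Request(
        f"{BASE_URL}/whisper?{query}",
        data=document,
        method="POST",
        headers={
            "unstract-key": api_key,
            # Assumption: a generic binary content type is acceptable.
            "Content-Type": "application/octet-stream",
        },
    )


# Placeholder bytes and key; a real call would read the file from disk.
req = build_whisper_request(b"%PDF-1.4 placeholder", api_key="YOUR_KEY")
print(req.get_method(), req.full_url)
```

Per the collection notes, a successful submission answers with 202 and a whisper_hash, which the status and retrieve calls then take as their job identifier.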

LLMWhisperer Whisper Status API

GET /whisper-status returns the processing state (accepted, processing, processed, error, retrieved) for a whisper_hash, with per-page execution detail.
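
The polling step can be sketched as a small loop over the five states listed above. This is a minimal sketch, not the vendor's client: `get_status` is a hypothetical callable that would wrap GET /whisper-status and return the state string, and the interval and attempt limits are arbitrary defaults chosen for the example.

```python
import time
from typing import Callable


def wait_until_processed(get_status: Callable[[], str],
                         interval_s: float = 2.0,
                         max_attempts: int = 30,
                         sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the job reports "processed", or fail loudly.

    get_status is a hypothetical callable supplied by the caller; it is
    expected to return one of the documented states.
    """
    for _ in range(max_attempts):
        status = get_status()
        if status == "processed":
            return status
        if status == "error":
            raise RuntimeError("extraction failed")
        if status == "retrieved":
            # The result was already fetched, so there is nothing to wait for.
            raise RuntimeError("result was already retrieved")
        if status not in ("accepted", "processing"):
            raise ValueError(f"unexpected status: {status!r}")
        sleep(interval_s)
    raise TimeoutError(f"job not processed after {max_attempts} attempts")


# Simulated status sequence; no network access is needed for the demo.
states = iter(["accepted", "processing", "processed"])
result = wait_until_processed(lambda: next(states), sleep=lambda _: None)
print(result)
```

Passing `sleep` as a parameter keeps the loop testable without real delays; production code would leave the default in place.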

LLMWhisperer Whisper Retrieve API

GET /whisper-retrieve returns the extracted result_text plus optional confidence_metadata for a processed whisper_hash. Results can be retrieved once.
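
Because a result can be retrieved only once, a caller will probably want to keep the first response rather than ask again. The sketch below assumes the response body is JSON with `result_text` and an optional `confidence_metadata` as top-level fields (the field names come from the description above; their exact position in the payload is an assumption). `fetch_body` is a hypothetical callable standing in for the HTTP call, and the sample text is made up.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class WhisperResult:
    text: str
    confidence_metadata: Optional[Any] = None


def parse_retrieve_body(body: str) -> WhisperResult:
    """Parse a /whisper-retrieve body, assuming top-level JSON fields."""
    payload = json.loads(body)
    return WhisperResult(
        text=payload["result_text"],
        confidence_metadata=payload.get("confidence_metadata"),
    )


def retrieve_once(whisper_hash: str,
                  fetch_body: Callable[[str], str],
                  cache: Dict[str, WhisperResult]) -> WhisperResult:
    """Fetch a result at most once per whisper_hash and reuse it afterwards."""
    if whisper_hash not in cache:
        cache[whisper_hash] = parse_retrieve_body(fetch_body(whisper_hash))
    return cache[whisper_hash]


# Demo with a fake fetcher and invented sample text (no network access).
calls = []

def fake_fetch(whisper_hash: str) -> str:
    calls.append(whisper_hash)
    return json.dumps({"result_text": "Invoice No: 42\nTotal: 10.00"})

cache: Dict[str, WhisperResult] = {}
first = retrieve_once("example-hash", fake_fetch, cache)
second = retrieve_once("example-hash", fake_fetch, cache)
print(first.text.splitlines()[0], len(calls))
```

An in-memory dictionary is used here only to keep the example short; anything that must survive a restart would need durable storage.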

LLMWhisperer Highlights API

GET /highlights returns per-line bounding-box coordinates (base_y, height, page, page_height) for the requested lines so callers can highlight extracted text in the source docum...
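
To show how the four fields might be used for highlighting, the sketch below converts one line's metadata into a vertical span expressed as fractions of the page height. It assumes, without confirmation from the source, that `base_y` marks the bottom edge of the line measured from the top of the page and that `height` extends upward from it; the numbers in the demo are invented.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class LineBox:
    base_y: float       # assumed: bottom edge of the line, from the page top
    height: float       # assumed: line height in the same units as base_y
    page: int
    page_height: float


def vertical_span(box: LineBox) -> Tuple[float, float]:
    """Return (top, bottom) as fractions of the page height, clamped to [0, 1]."""
    top = (box.base_y - box.height) / box.page_height
    bottom = box.base_y / box.page_height
    return max(0.0, top), min(1.0, bottom)


# Invented values, purely for illustration.
box = LineBox(base_y=200, height=50, page=0, page_height=1000)
top, bottom = vertical_span(box)
print(box.page, top, bottom)
```

Working in fractions keeps the overlay independent of the resolution at which the source page is rendered.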

LLMWhisperer Webhooks API

Register and manage webhook callbacks via /whisper-manage-callback (POST/GET/PUT/DELETE). Submit a document with use_webhook to have the extracted result delivered to your endpo...
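
Registering a callback could look roughly like the sketch below, which builds (without sending) the POST request. The endpoint and the three JSON fields (`url`, `auth_token`, `webhook_name`) are taken from the collection under Sources; the helper name `build_register_webhook` and the `Content-Type` header are assumptions for this example.

```python
import json
from urllib.request import Request

# Endpoint as it appears in the collection under Sources.
CALLBACK_URL = ("https://llmwhisperer-api.us-central.unstract.com"
                "/api/v2/whisper-manage-callback")


def build_register_webhook(api_key: str, webhook_name: str, url: str,
                           auth_token: str = "") -> Request:
    """Build a POST request that registers a webhook (hypothetical helper)."""
    body = json.dumps({
        "url": url,
        "auth_token": auth_token,
        "webhook_name": webhook_name,
    }).encode("utf-8")
    return Request(
        CALLBACK_URL,
        data=body,
        method="POST",
        headers={
            "unstract-key": api_key,
            "Content-Type": "application/json",  # assumption
        },
    )


req = build_register_webhook("YOUR_KEY", "my-webhook",
                             "https://example.com/webhook")
print(req.get_method(), json.loads(req.data)["webhook_name"])
```

The same body shape appears on the PUT request in the collection, so an update helper would likely differ only in the HTTP method.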

Collections

LLMWhisperer API

Pricing Plans

LLMWhisperer Plans Pricing

3 plans

Rate Limits

LLMWhisperer Rate Limits

3 limits

FinOps

LLMWhisperer FinOps

Resources

GitHub Organization

Sources

opencollection: 1.0.0
info:
  name: LLMWhisperer API
  version: '2.0'
request:
  auth:
    type: apikey
    apikey:
      key: unstract-key
      value: '{{unstractKey}}'
      in: header
items:
- info:
    name: Extraction
    type: folder
  items:
  - info:
      name: Submit a document for text extraction
      type: http
    http:
      method: POST
      url: https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper?mode=form&output_mode=layout_preserving&page_seperator=<<<&lang=eng&tag=default&add_line_nos=false&url_in_post=false
      body:
        type: binary
        data: ''
    docs: Converts a document to text asynchronously. Send the document as binary (or a URL when url_in_post=true). Returns
      202 with a whisper_hash.
- info:
    name: Status
    type: folder
  items:
  - info:
      name: Check the status of a whisper job
      type: http
    http:
      method: GET
      url: https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper-status?
    docs: Returns the processing state (accepted, processing, processed, error, retrieved) for a whisper_hash.
- info:
    name: Retrieve
    type: folder
  items:
  - info:
      name: Retrieve the extracted text
      type: http
    http:
      method: GET
      url: https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper-retrieve?text_only=false
    docs: Returns result_text and optional metadata for a processed whisper_hash. Retrievable once.
- info:
    name: Highlights
    type: folder
  items:
  - info:
      name: Retrieve per-line bounding-box coordinates
      type: http
    http:
      method: GET
      url: https://llmwhisperer-api.us-central.unstract.com/api/v2/highlights?lines=1-5,7,21-
    docs: Returns per-line bounding-box metadata for the requested lines.
- info:
    name: Webhooks
    type: folder
  items:
  - info:
      name: Register a webhook callback
      type: http
    http:
      method: POST
      url: https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper-manage-callback
      body:
        type: json
        data: "{\n  \"url\": \"https://example.com/webhook\",\n  \"auth_token\": \"\",\n  \"webhook_name\": \"my-webhook\"\
          \n}"
    docs: Registers a webhook that receives the extracted result on completion.
  - info:
      name: Retrieve a registered webhook
      type: http
    http:
      method: GET
      url: https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper-manage-callback?webhook_name=my-webhook
    docs: Retrieves details for a registered webhook.
  - info:
      name: Update a registered webhook
      type: http
    http:
      method: PUT
      url: https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper-manage-callback
      body:
        type: json
        data: "{\n  \"url\": \"https://example.com/webhook\",\n  \"auth_token\": \"\",\n  \"webhook_name\": \"my-webhook\"\
          \n}"
    docs: Updates a registered webhook.
  - info:
      name: Delete a registered webhook
      type: http
    http:
      method: DELETE
      url: https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper-manage-callback?webhook_name=my-webhook
    docs: Deletes a registered webhook.