Deepgram

Deepgram is an enterprise voice AI platform that provides speech-to-text, text-to-speech, and voice agent APIs powered by advanced AI models. The platform offers real-time and batch transcription through its Nova model family, natural-sounding speech synthesis through its Aura model family, and an end-to-end Voice Agent API that combines STT, LLM orchestration, and TTS into a single real-time interface.

5 APIs · 16 Features
Tags: Artificial Intelligence, Speech-To-Text, Text-To-Speech, Transcription, Voice AI

APIs

Deepgram Speech-To-Text API

The Deepgram Speech-to-Text API provides accurate, fast transcription of audio content using advanced AI models. It supports both pre-recorded audio files and real-time streamin...

Deepgram Text-To-Speech API

The Deepgram Text-to-Speech API converts text into natural-sounding speech using the Aura model family. It supports both single text requests and continuous streaming text-to-sp...

Deepgram Voice Agent API

The Deepgram Voice Agent API is an end-to-end solution that combines speech-to-text, LLM orchestration, and text-to-speech into a single real-time API. It simplifies the develop...

Deepgram Audio Intelligence API

The Deepgram Audio Intelligence API provides advanced analysis capabilities for audio and text content. It offers features including sentiment analysis, summarization, topic det...

Deepgram Management API

The Deepgram Management API allows developers to programmatically manage their Deepgram account resources. It provides endpoints for creating and managing API keys, configuring ...
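As a rough illustration of how the pre-recorded side of the Speech-to-Text API might be called, the sketch below builds (but does not send) an HTTP request. The base URL comes from this listing; the `/v1/listen` path, the `Token` authorization scheme, and the `model`, `diarize`, and `smart_format` query parameters are assumptions drawn from Deepgram's public documentation rather than from this page, and `build_transcription_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.deepgram.com"  # baseURL as given in this listing


def build_transcription_request(api_key, audio_url, model="nova-3",
                                diarize=True, smart_format=True):
    """Build a POST request asking for transcription of a hosted audio file.

    Assumptions (not stated in this listing): the /v1/listen path, the
    'Token' auth scheme, and the query parameter names.
    """
    query = urllib.parse.urlencode({
        "model": model,
        "diarize": str(diarize).lower(),
        "smart_format": str(smart_format).lower(),
    })
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/listen?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_transcription_request("YOUR_API_KEY", "https://example.com/audio.wav")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it can be inspected without credentials.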

Features

Nova-3 STT: $0.0048/min mono, $0.0058/min multilingual
Flux STT: $0.0065/min English, $0.0078/min multilingual
Aura-1 TTS at $0.015/1k characters
Aura-2 TTS at $0.030/1k characters with studio quality
Streaming and pre-recorded transcription
Speaker diarization, smart formatting
Default concurrency limits: 50 streaming (PAYG), 100 pre-recorded
Voice cloning on Aura models
Voice agents combining STT + LLM + TTS
30+ language support on multilingual models
WebSocket streaming for real-time STT
REST API for pre-recorded files
Audio Intelligence: summarization, topics, sentiment, entities
Custom model training (Enterprise)
Self-hosted on-prem option (Enterprise)
OAuth 2.0 + API keys
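The per-minute and per-character rates above lend themselves to a quick cost estimate. The sketch below hard-codes the rates exactly as listed here; the function names and the `(model, variant)` keys are hypothetical, and "mono" is kept as written in the listing. Actual billing may round or tier differently, so treat the results as estimates only.

```python
from decimal import Decimal

# Rates copied from the feature list above (USD).
STT_RATE_PER_MIN = {
    ("nova-3", "mono"): Decimal("0.0048"),
    ("nova-3", "multilingual"): Decimal("0.0058"),
    ("flux", "english"): Decimal("0.0065"),
    ("flux", "multilingual"): Decimal("0.0078"),
}
TTS_RATE_PER_1K_CHARS = {
    "aura-1": Decimal("0.015"),
    "aura-2": Decimal("0.030"),
}


def stt_cost(model, variant, minutes):
    """Estimated speech-to-text cost: rate per minute times minutes."""
    return STT_RATE_PER_MIN[(model, variant)] * Decimal(str(minutes))


def tts_cost(model, characters):
    """Estimated text-to-speech cost: rate per 1,000 characters."""
    return TTS_RATE_PER_1K_CHARS[model] * Decimal(characters) / Decimal(1000)


if __name__ == "__main__":
    print(stt_cost("nova-3", "mono", 1000))   # 1,000 minutes of Nova-3
    print(tts_cost("aura-2", 50_000))         # 50,000 characters of Aura-2
```

At the listed rates, 1,000 minutes of Nova-3 mono transcription comes to $4.80 and 50,000 characters of Aura-2 synthesis to $1.50. `Decimal` is used so that small per-minute rates do not accumulate floating-point error.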

Event Specifications

Deepgram Speech-to-Text Streaming Events

The Deepgram Speech-to-Text streaming API provides real-time transcription of audio using a WebSocket connection. Audio data is sent as binary WebSocket messages and transcripti...

ASYNCAPI

Deepgram Text-to-Speech Streaming Events

The Deepgram Text-to-Speech streaming API provides real-time speech synthesis over a WebSocket connection. Text is sent as JSON messages and audio data is returned as binary Web...

ASYNCAPI

Deepgram Voice Agent Events

The Deepgram Voice Agent API is an end-to-end solution that combines speech-to-text, LLM orchestration, and text-to-speech into a single real-time WebSocket API. It simplifies b...

ASYNCAPI
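For the Speech-to-Text streaming events described above, a client typically has to separate interim results from final ones. The sketch below shows one way that filtering could look, operating on messages already received from the WebSocket. The message shape (`type`, `is_final`, `channel.alternatives[0].transcript`) is an assumption based on Deepgram's public documentation and is not spelled out in this listing; `collect_final_transcript` is a hypothetical helper.

```python
import json


def collect_final_transcript(raw_messages):
    """Join the transcripts of all finalized 'Results' messages.

    Assumed message shape (not confirmed by this listing):
    {"type": "Results", "is_final": true,
     "channel": {"alternatives": [{"transcript": "..."}]}}
    """
    parts = []
    for raw in raw_messages:
        if isinstance(raw, (bytes, bytearray)):
            continue  # transcripts are expected as JSON text, so skip binary frames
        message = json.loads(raw)
        if message.get("type") != "Results" or not message.get("is_final"):
            continue  # ignore metadata and interim hypotheses
        alternatives = message.get("channel", {}).get("alternatives", [])
        if alternatives and alternatives[0].get("transcript"):
            parts.append(alternatives[0]["transcript"])
    return " ".join(parts)


if __name__ == "__main__":
    sample = [
        json.dumps({"type": "Results", "is_final": False,
                    "channel": {"alternatives": [{"transcript": "hello wor"}]}}),
        json.dumps({"type": "Results", "is_final": True,
                    "channel": {"alternatives": [{"transcript": "hello world"}]}}),
    ]
    print(collect_final_transcript(sample))
```

Keeping only finalized segments avoids duplicating text that the service later revises; a live captioning UI would likely display the interim results as well.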

Semantic Vocabularies

Deepgram Context

0 classes · 11 properties

JSON-LD

API Governance Rules

Deepgram API Rules

4 rules · 4 warnings

SPECTRAL

Deepgram API Rules

5 rules · 1 error · 4 warnings

SPECTRAL

Deepgram API Rules

4 rules · 2 errors · 2 warnings

SPECTRAL

JSON Structure

Deepgram Structure

0 properties

JSON STRUCTURE

Resources

PostmanWorkspace
ArazzoWorkflows
LinkedIn
Documentation (Developer Portal)
Documentation (API Reference)
Pricing
Authentication
ChangeLog
SDK (Python SDK)
SDK (JavaScript SDK)
Website
PrivacyPolicy
TermsOfService
JSONLD
JSONSchema
Vocabulary
LLMsTxt

Sources

aid: deepgram
url: https://raw.githubusercontent.com/api-evangelist/deepgram/refs/heads/main/apis.yml
apis:
- aid: deepgram:speech-to-text-api
  name: Deepgram Speech-To-Text API
  tags:
  - Audio
  - Speech Recognition
  - Speech-To-Text
  - Transcription
  image: https://kinlane-images.s3.amazonaws.com/shared/apis-json/apis-json-logo.jpg
  baseURL: https://api.deepgram.com
  humanURL: https://developers.deepgram.com/docs/stt/getting-started
  properties:
  - url: https://developers.deepgram.com/docs/stt/getting-started
    type: Documentation
  - url: openapi/deepgram-speech-to-text-openapi.yml
    type: OpenAPI
  - url: asyncapi/deepgram-speech-to-text-asyncapi.yml
    type: AsyncAPI
  - url: rules/deepgram-speech-to-text-api-rules.yml
    type: Rules
  - url: capabilities/deepgram-speech-to-text-api-capabilities.yml
    type: Capabilities
  - url: graphql/deepgram-graphql.md
    type: GraphQL
  description: The Deepgram Speech-to-Text API provides accurate, fast transcription
    of audio content using advanced AI models. It supports both pre-recorded audio
    files and real-time streaming audio, delivering transcripts in under 300 milliseconds.
    The API includes features such as punctuation, diarization, language detection,
    smart formatting, and support for multiple languages and audio formats.
- aid: deepgram:text-to-speech-api
  name: Deepgram Text-To-Speech API
  tags:
  - Audio
  - Speech Synthesis
  - Text-To-Speech
  - Voice
  image: https://kinlane-images.s3.amazonaws.com/shared/apis-json/apis-json-logo.jpg
  baseURL: https://api.deepgram.com
  humanURL: https://developers.deepgram.com/reference/text-to-speech-api/speak
  properties:
  - url: https://developers.deepgram.com/reference/text-to-speech-api/speak
    type: Documentation
  - url: openapi/deepgram-text-to-speech-openapi.yml
    type: OpenAPI
  - url: asyncapi/deepgram-text-to-speech-asyncapi.yml
    type: AsyncAPI
  - url: rules/deepgram-text-to-speech-api-rules.yml
    type: Rules
  - url: capabilities/deepgram-text-to-speech-api-capabilities.yml
    type: Capabilities
  description: The Deepgram Text-to-Speech API converts text into natural-sounding
    speech using the Aura model family. It supports both single text requests and
    continuous streaming text-to-speech, delivering sub-200 millisecond latency suitable
    for real-time voice agents and conversational AI applications. The API offers
    multiple voice options and is designed for enterprise-grade deployments including
    voicebots, IVR systems, and interactive voice applications.
- aid: deepgram:voice-agent-api
  name: Deepgram Voice Agent API
  tags:
  - Conversational AI
  - Real-Time
  - Voice Agent
  - Voice AI
  image: https://kinlane-images.s3.amazonaws.com/shared/apis-json/apis-json-logo.jpg
  baseURL: https://api.deepgram.com
  humanURL: https://deepgram.com/product/voice-agent-api
  properties:
  - url: https://developers.deepgram.com/docs/voice-agent/getting-started
    type: Documentation
  - url: asyncapi/deepgram-voice-agent-asyncapi.yml
    type: AsyncAPI
  description: The Deepgram Voice Agent API is an end-to-end solution that combines
    speech-to-text, LLM orchestration, and text-to-speech into a single real-time
    API. It simplifies the development of conversational voice agents by eliminating
    the need to stitch together multiple services. The API includes built-in barge-in
    detection, turn-taking prediction, function calling, and mid-session control to
    ensure smooth, natural conversations without pauses or interruptions.
- aid: deepgram:audio-intelligence-api
  name: Deepgram Audio Intelligence API
  tags:
  - Audio Intelligence
  - Sentiment Analysis
  - Summarization
  - Topic Detection
  image: https://kinlane-images.s3.amazonaws.com/shared/apis-json/apis-json-logo.jpg
  baseURL: https://api.deepgram.com
  humanURL: https://developers.deepgram.com/docs/audio-intelligence
  properties:
  - url: https://developers.deepgram.com/docs/audio-intelligence
    type: Documentation
  - url: openapi/deepgram-speech-to-text-openapi.yml
    type: OpenAPI
  description: The Deepgram Audio Intelligence API provides advanced analysis capabilities
    for audio and text content. It offers features including sentiment analysis, summarization,
    topic detection, and intent recognition. These capabilities allow developers to
    extract structured insights from transcribed audio or text input, enabling use
    cases such as call center analytics, meeting summarization, and content categorization.
- aid: deepgram:management-api
  name: Deepgram Management API
  tags:
  - Administration
  - API Keys
  - Management
  - Projects
  image: https://kinlane-images.s3.amazonaws.com/shared/apis-json/apis-json-logo.jpg
  baseURL: https://api.deepgram.com
  humanURL: https://developers.deepgram.com/docs/create-additional-api-keys
  properties:
  - url: https://developers.deepgram.com/docs/create-additional-api-keys
    type: Documentation
  - url: openapi/deepgram-management-openapi.yml
    type: OpenAPI
  - url: rules/deepgram-management-api-rules.yml
    type: Rules
  - url: capabilities/deepgram-management-api-capabilities.yml
    type: Capabilities
  description: The Deepgram Management API allows developers to programmatically manage
    their Deepgram account resources. It provides endpoints for creating and managing
    API keys, configuring projects, managing team members, and monitoring usage. This
    API enables automation of administrative tasks and integration of Deepgram account
    management into existing workflows and infrastructure tooling.
name: Deepgram
tags:
- Artificial Intelligence
- Speech-To-Text
- Text-To-Speech
- Transcription
- Voice AI
kind: company
image: https://kinlane-images.s3.amazonaws.com/shared/apis-json/apis-json-logo.jpg
access: 3rd-Party
common:
- type: PostmanWorkspace
  url: https://www.postman.com/kinlaneapi/deepgram/overview
- type: ArazzoWorkflows
  url: arazzo/
  workflows:
  - url: arazzo/deepgram-audit-project-usage-workflow.yml
    name: Deepgram Audit Project Usage and Billing
    summary: Pull a project's details, usage summary, request log, and billing balances
      into a single audit snapshot.
  - url: arazzo/deepgram-balance-gated-transcription-workflow.yml
    name: Deepgram Balance-Gated Transcription
    summary: Check a project's billing balance and only transcribe audio when sufficient
      credit remains.
  - url: arazzo/deepgram-diarized-transcript-intelligence-workflow.yml
    name: Deepgram Diarized Transcript Intelligence
    summary: Transcribe a multi-speaker recording with diarization, then run targeted
      text intelligence over the transcript.
  - url: arazzo/deepgram-invite-and-confirm-member-workflow.yml
    name: Deepgram Invite and Confirm a Project Member
    summary: Send a project invitation, confirm it appears in the pending invitation
      list, and list current members.
  - url: arazzo/deepgram-provision-project-key-workflow.yml
    name: Deepgram Provision a Project API Key
    summary: Create a new project, mint a scoped API key for it, and verify the key
      appears in the project key list.
  - url: arazzo/deepgram-rotate-project-key-workflow.yml
    name: Deepgram Rotate a Project API Key
    summary: Mint a replacement API key in a project, verify it, then delete the old
      key to complete a rotation.
  - url: arazzo/deepgram-select-model-and-transcribe-workflow.yml
    name: Deepgram Select a Model and Transcribe
    summary: Browse available models, read the metadata for a chosen model, then transcribe
      audio with that model.
  - url: arazzo/deepgram-transcribe-analyze-synthesize-workflow.yml
    name: Deepgram Transcribe, Analyze, and Synthesize
    summary: Transcribe audio to text, run text intelligence on the transcript, then
      synthesize a spoken response.
  - url: arazzo/deepgram-transcribe-and-track-usage-workflow.yml
    name: Deepgram Transcribe Audio and Track Usage
    summary: Transcribe a pre-recorded audio URL and then reconcile the request against
      project usage and request logs.
  - url: arazzo/deepgram-update-member-scopes-workflow.yml
    name: Deepgram Update a Member's Scopes
    summary: Locate a project member, read their current scopes, update them, and
      confirm the new scopes took effect.
- type: LinkedIn
  url: https://www.linkedin.com/company/deepgram
- url: https://developers.deepgram.com/home
  name: Developer Portal
  type: Documentation
- url: https://developers.deepgram.com/reference/deepgram-api-overview
  name: API Reference
  type: Documentation
- url: https://deepgram.com/pricing
  name: Pricing
  type: Pricing
- url: https://developers.deepgram.com/docs/authenticating
  name: Authentication
  type: Authentication
- url: https://developers.deepgram.com/changelog
  name: Changelog
  type: ChangeLog
- url: https://github.com/deepgram/deepgram-python-sdk
  name: Python SDK
  type: SDK
- url: https://github.com/deepgram/deepgram-js-sdk
  name: JavaScript SDK
  type: SDK
- url: https://deepgram.com/
  name: Deepgram
  type: Website
- url: https://deepgram.com/privacy
  name: Privacy Policy
  type: PrivacyPolicy
- url: https://deepgram.com/tos
  name: Terms of Service
  type: TermsOfService
- url: json-ld/deepgram-context.jsonld
  type: JSONLD
- url: json-schema/deepgram-transcript-schema.json
  type: JSONSchema
- url: vocabulary/deepgram-vocabulary.yml
  type: Vocabulary
- type: Features
  data:
  - 'Nova-3 STT: $0.0048/min mono, $0.0058/min multilingual'
  - 'Flux STT: $0.0065/min English, $0.0078/min multilingual'
  - Aura-1 TTS at $0.015/1k characters
  - Aura-2 TTS at $0.030/1k characters with studio quality
  - Streaming and pre-recorded transcription
  - Speaker diarization, smart formatting
  - Default 50 streaming concurrent (PAYG), 100 pre-recorded
  - Voice cloning on Aura models
  - Voice agents combining STT + LLM + TTS
  - 30+ language support on multilingual models
  - WebSocket streaming for real-time STT
  - REST API for pre-recorded files
  - 'Audio Intelligence: summarization, topics, sentiment, entities'
  - Custom model training (Enterprise)
  - Self-hosted on-prem option (Enterprise)
  - OAuth 2.0 + API keys
  sources:
  - https://deepgram.com/pricing
  updated: '2026-05-04'
- type: Integrations
  url: https://deepgram.com/partners
- type: LLMsTxt
  url: https://deepgram.com/llms.txt
created: '2026-03-20'
modified: '2026-05-19'
position: Consuming
description: Deepgram is an enterprise voice AI platform that provides speech-to-text,
  text-to-speech, and voice agent APIs powered by advanced AI models. The platform
  offers real-time and batch transcription through its Nova model family, natural-sounding
  speech synthesis through its Aura model family, and an end-to-end Voice Agent API
  that combines STT, LLM orchestration, and TTS into a single real-time interface.
integrations:
- name: Technology
- name: Development
- name: Vonage Technology
- name: Cloudflare Technology
- name: Daily.co Technology
- name: Stream Technology
- name: Kore Technology
- name: Google Cloud Technology
- name: AudioCodes Technology
- name: Vida Technology
- name: Recall.ai Technology
- name: Porter Technology
- name: Perlon AI Development
- name: OneSix Solutions Development
- name: Lumio AI Development
- name: LucidPoint Development
- name: Lindy Technology
- name: InfoCap Development
- name: Five9 Technology
- name: Caylent Development
- name: APrime Development
- name: AI Heroes Development
- name: AICG Technology
- name: Deepgram & Vercel Next.js Templates Technology
- name: AWS Technology
- name: Abby Connect Technology
- name: Voximplant Technology
- name: Cognigy Technology
- name: Enterprise Bot Technology
maintainers:
- FN: Kin Lane
  email: kin@apievangelist.com
specificationVersion: '0.19'
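Many `url` values in the index above are relative paths (for example `openapi/deepgram-speech-to-text-openapi.yml`), while others are absolute. A minimal sketch of resolving them follows, assuming relative paths are meant to be resolved against the location of `apis.yml` itself; that convention is an assumption, and `resolve_artifact` is a hypothetical helper.

```python
from urllib.parse import urljoin

# The index's own location, taken from the top-level `url` field above.
INDEX_URL = ("https://raw.githubusercontent.com/api-evangelist/deepgram/"
             "refs/heads/main/apis.yml")


def resolve_artifact(url, index_url=INDEX_URL):
    """Resolve a property URL against the index; absolute URLs pass through."""
    return urljoin(index_url, url)


if __name__ == "__main__":
    print(resolve_artifact("openapi/deepgram-speech-to-text-openapi.yml"))
    print(resolve_artifact("https://deepgram.com/pricing"))
```

`urljoin` replaces the final path segment (`apis.yml`) with the relative path, so sibling directories such as `openapi/`, `asyncapi/`, and `arazzo/` resolve under the same repository branch.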