Replicate

Replicate lets you run machine learning models in the cloud with a simple API. Thousands of open-source models are available, and you can run your own custom models at scale. Run image generation, language models, audio synthesis, video generation, and more with a few lines of code. Replicate makes AI accessible to every software engineer.

1 API · 16 Features
Artificial Intelligence · Machine Learning · Image Generation · Language Models · Model Deployment

APIs

Replicate

Replicate lets you run machine learning models in the cloud with a simple REST API. Access thousands of open-source models for image generation, language modeling, audio synthes...
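
Several of the workflows listed in the source file on this page follow a create-then-poll pattern: start a prediction, then re-read it until it reaches a terminal state, optionally giving up after a bounded number of attempts. The following is a minimal Python sketch of the polling half only. The function name `poll_until_terminal`, its parameters, and the status names in `TERMINAL_STATES` are assumptions made for illustration and should be checked against the API reference; `fetch` stands in for whatever call retrieves the prediction.

```python
import time
from typing import Callable, Dict

# Assumed terminal status names; verify against the API reference.
TERMINAL_STATES = {"succeeded", "failed", "canceled"}


def poll_until_terminal(
    fetch: Callable[[], Dict],
    interval: float = 1.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call `fetch` until the returned prediction has a terminal status.

    `fetch` is a hypothetical zero-argument callable that returns the
    prediction as a dict with a "status" key. Raises TimeoutError when
    `max_attempts` polls pass without reaching a terminal status.
    """
    for attempt in range(max_attempts):
        prediction = fetch()
        if prediction.get("status") in TERMINAL_STATES:
            return prediction
        # Wait between polls, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"prediction not finished after {max_attempts} polls")
```

Passing `fetch` and `sleep` in as callables keeps the loop independent of any HTTP client, so the same function can be exercised with canned responses. A caller that catches `TimeoutError` can then cancel the prediction, as the bounded-wait workflow describes.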

Features

T4 GPU at $0.000225/sec (cheapest)
L40S GPU at $0.000975/sec
A100 80GB at $0.00140/sec
H100 at $0.001525/sec (highest performance)
Pay only for execution time (per second)
Default 10 predictions/sec; can be raised to 100 on paid
Other endpoints: 60 req/sec
Public model library with thousands of models
Cog framework for packaging your own models
Deployments for low-latency inference (charges idle time)
Webhooks for prediction completion
OAuth 2.0 and API tokens
Streaming output for LLM models
Files input via signed URLs
Training service for fine-tuning
Trainings billed at hardware rate
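
Because billing is per second of execution, a rough cost estimate is the hardware rate multiplied by the run time. The sketch below uses the four rates listed above; the names `RATES_USD_PER_SEC` and `estimate_cost` are made up for illustration, and the estimate ignores anything the list does not state (for example, idle time charged on deployments).

```python
# Per-second hardware rates in USD, copied from the feature list above.
RATES_USD_PER_SEC = {
    "T4": 0.000225,
    "L40S": 0.000975,
    "A100 80GB": 0.00140,
    "H100": 0.001525,
}


def estimate_cost(hardware: str, seconds: float) -> float:
    """Return the estimated USD cost of running `hardware` for `seconds`."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    # Raises KeyError for hardware that is not in the table.
    return RATES_USD_PER_SEC[hardware] * seconds
```

At these rates, `estimate_cost("T4", 3600)` comes to about $0.81 for an hour of execution and `estimate_cost("H100", 3600)` to about $5.49.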

Event Specifications

Replicate Streaming and Webhooks API

AsyncAPI definition for Replicate's event-driven surfaces: - Server-Sent Events (SSE) stream returned for predictions where the model supports streaming output. The stream URL i...

ASYNCAPI
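
For the Server-Sent Events surface, a client reads a text stream of `event:` and `data:` lines, with a blank line closing each event. The Python sketch below parses that generic SSE framing from an iterable of lines. It is a simplified reading of the SSE format, not Replicate's client code; `parse_sse` is a hypothetical name, and the event names a real stream emits should be taken from the AsyncAPI definition.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event = "message"  # default event type when no "event:" field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; emit it if it carried data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space after the colon is dropped
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)  # multiple data lines are joined with "\n"
```

Any line-oriented source works as input, such as an open file or the line iterator of a streaming HTTP response, so the parser stays independent of the HTTP client used to open the stream.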

Semantic Vocabularies

Replicate Context

2 classes · 32 properties

JSON-LD

API Governance Rules

Replicate API Rules

10 rules · 2 errors · 6 warnings

SPECTRAL

Resources

PostmanWorkspace
ArazzoWorkflows
LinkedIn
Website
Documentation
Pricing
Blog
ChangeLog
TermsOfService
PrivacyPolicy
SignUp
Login
Playground
GitHubOrganization
SDKs
Python SDK
Node.js SDK
Go SDK
Swift SDK
Cog
StatusPage
MCPServer
AgentSkill
LLMsTxt

Sources

aid: replicate
name: Replicate
description: >-
  Replicate lets you run machine learning models in the cloud with a simple API. Thousands of open-source models are
  available, and you can run your own custom models at scale. Run image generation, language models, audio synthesis,
  video generation, and more with a few lines of code. Replicate makes AI accessible to every software engineer.
type: Index
image: https://kinlane-images.s3.amazonaws.com/shared/apis-json/apis-json-logo.jpg
tags:
  - Artificial Intelligence
  - Machine Learning
  - Image Generation
  - Language Models
  - Model Deployment
url: https://raw.githubusercontent.com/api-evangelist/replicate/refs/heads/main/apis.yml
created: '2024-11-13'
modified: '2026-05-29'
specificationVersion: '0.19'
apis:
  - aid: replicate:replicate
    name: Replicate
    description: >-
      Replicate lets you run machine learning models in the cloud with a simple REST API. Access thousands of
      open-source models for image generation, language modeling, audio synthesis, video generation, upscaling, and
      more. Create predictions, manage deployments, fine-tune models, and run training jobs via a clean API with
      webhooks and streaming support.
    humanURL: https://replicate.com/
    tags:
      - Accounts
      - Artificial Intelligence
      - Collections
      - Deployments
      - Hardware
      - Machine Learning
      - Models
      - Predictions
      - Training
      - Webhooks
    properties:
      - url: https://replicate.com/docs
        type: Documentation
      - url: https://replicate.com/docs/reference/http
        type: OpenAPI Documentation
      - url: openapi/replicate-openapi.yml
        type: OpenAPI
      - url: asyncapi/replicate-asyncapi.yml
        type: AsyncAPI
      - url: rules/replicate-rules.yml
        type: SpectralRuleset
      - url: vocabulary/replicate-vocabulary.yml
        type: Vocabulary
common:
  - type: PostmanWorkspace
    url: https://www.postman.com/kinlaneapi/replicate/overview
  - type: ArazzoWorkflows
    url: arazzo/
    workflows:
      - url: arazzo/replicate-collection-predict-workflow.yml
        name: Replicate Pick a Model from a Collection and Predict
        summary: Read a curated collection, confirm a chosen model, run its latest version, and poll the prediction.
      - url: arazzo/replicate-deploy-and-predict-workflow.yml
        name: Replicate Create a Deployment and Run a Prediction Through It
        summary: Pick hardware, create a deployment for a model version, then run a prediction via the deployment.
      - url: arazzo/replicate-model-version-predict-workflow.yml
        name: Replicate Resolve Latest Version and Predict
        summary: Look up a model, pick its latest version, run a prediction, and poll to completion.
      - url: arazzo/replicate-official-model-predict-workflow.yml
        name: Replicate Run an Official Model and Poll
        summary: Run a prediction against an official model by name, then poll until complete.
      - url: arazzo/replicate-predict-and-poll-workflow.yml
        name: Replicate Create Prediction and Poll Until Complete
        summary: Run a model version, then poll the prediction until it reaches a terminal state.
      - url: arazzo/replicate-predict-with-timeout-cancel-workflow.yml
        name: Replicate Run a Prediction with Bounded Wait and Cancel
        summary: Create a prediction, poll a bounded number of times, and cancel it if it has not finished.
      - url: arazzo/replicate-scale-deployment-and-predict-workflow.yml
        name: Replicate Scale a Deployment and Run a Prediction
        summary: Read a deployment, update its version and instance bounds, then run a prediction through it.
      - url: arazzo/replicate-search-model-and-predict-workflow.yml
        name: Replicate Search for a Model and Run a Prediction
        summary: Search public models by query, run a prediction on the top match's latest version, and poll it.
      - url: arazzo/replicate-train-model-and-poll-workflow.yml
        name: Replicate Start a Training and Poll Until Complete
        summary: Start a fine-tuning run from a base model version, then poll until the training finishes.
      - url: arazzo/replicate-webhook-secured-predict-workflow.yml
        name: Replicate Fetch Webhook Secret and Run a Webhook Prediction
        summary: Retrieve the default webhook signing secret, then create a prediction that posts to a webhook.
  - type: LinkedIn
    url: https://www.linkedin.com/company/replicate
  - type: Website
    url: https://replicate.com
  - type: Documentation
    url: https://replicate.com/docs
  - type: Pricing
    url: https://replicate.com/pricing
  - type: Blog
    url: https://replicate.com/blog
  - type: ChangeLog
    url: https://replicate.com/changelog
  - type: TermsOfService
    url: https://replicate.com/terms
  - type: PrivacyPolicy
    url: https://replicate.com/privacy
  - type: SignUp
    url: https://replicate.com/signin?next=/docs
  - type: Login
    url: https://replicate.com/signin
  - type: Playground
    url: https://replicate.com/explore
  - type: GitHubOrganization
    url: https://github.com/replicate
  - type: SDKs
    url: https://replicate.com/docs/reference/client-libraries
  - type: Python SDK
    url: https://github.com/replicate/replicate-python
  - type: Node.js SDK
    url: https://github.com/replicate/replicate-javascript
  - type: Go SDK
    url: https://github.com/replicate/replicate-go
  - type: Swift SDK
    url: https://github.com/replicate/replicate-swift
  - type: Cog
    url: https://github.com/replicate/cog
  - type: StatusPage
    url: https://status.replicate.com
  - type: Features
    data:
      - T4 GPU at $0.000225/sec (cheapest)
      - L40S GPU at $0.000975/sec
      - A100 80GB at $0.00140/sec
      - H100 at $0.001525/sec (highest performance)
      - Pay only for execution time (per second)
      - Default 10 predictions/sec; can be raised to 100 on paid
      - 'Other endpoints: 60 req/sec'
      - Public model library with thousands of models
      - Cog framework for packaging your own models
      - Deployments for low-latency inference (charges idle time)
      - Webhooks for prediction completion
      - OAuth 2.0 and API tokens
      - Streaming output for LLM models
      - Files input via signed URLs
      - Training service for fine-tuning
      - Trainings billed at hardware rate
    sources:
      - https://replicate.com/pricing
    updated: '2026-05-04'
  - name: MCP Server
    url: https://github.com/replicate/replicate-mcp-code-mode
    type: MCPServer
  - name: Agent Skills
    url: https://github.com/replicate/skills
    type: AgentSkill
  - type: LLMsTxt
    url: https://replicate.com/llms.txt
maintainers:
  - FN: Kin Lane
    email: kin@apievangelist.com