CoreWeave Inference API

The CoreWeave Inference API manages Deployments, Gateways, and Capacity Claims for serverless and dedicated AI inference. Use it to create and update managed model deployments backed by CoreWeave's GPU fleet, and to route requests to them.

API entry from apis.yml

aid: coreweave:inference-api
name: CoreWeave Inference API
description: The CoreWeave Inference API manages Deployments, Gateways, and Capacity Claims for serverless
  and dedicated AI inference. It is used to create, update, and route to managed model deployments backed
  by CoreWeave's GPU fleet.
humanURL: https://docs.coreweave.com/products/inference/reference/api-overview
tags:
- AI
- Deployments
- Gateways
- Inference
- Models
properties:
- type: Documentation
  url: https://docs.coreweave.com/products/inference/reference/api-overview
- type: Reference
  url: https://docs.coreweave.com/products/inference/reference/deploymentservice/create-deployment
- type: Reference
  url: https://docs.coreweave.com/products/inference/reference/gatewayservice/create-gateway
- type: Reference
  url: https://docs.coreweave.com/products/inference/reference/capacityclaimservice/create-capacity-claim
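
The `properties` list in the entry above pairs each link with a `type`, so a consumer of the catalog can group the URLs by kind. The following is a minimal sketch of that grouping, not part of the CoreWeave API or of any apis.yml tooling. It assumes the entry has already been loaded into a Python dict (for example with a YAML parser); the helper name `urls_by_type` is hypothetical, and the dict simply mirrors the fields shown above.

```python
from collections import defaultdict

BASE = "https://docs.coreweave.com/products/inference/reference"

# Assumption: the apis.yml entry has already been parsed into a dict.
# Only the fields needed for this sketch are reproduced here.
entry = {
    "aid": "coreweave:inference-api",
    "name": "CoreWeave Inference API",
    "properties": [
        {"type": "Documentation", "url": f"{BASE}/api-overview"},
        {"type": "Reference", "url": f"{BASE}/deploymentservice/create-deployment"},
        {"type": "Reference", "url": f"{BASE}/gatewayservice/create-gateway"},
        {"type": "Reference", "url": f"{BASE}/capacityclaimservice/create-capacity-claim"},
    ],
}


def urls_by_type(api_entry):
    """Group property URLs by their `type` field (hypothetical helper)."""
    grouped = defaultdict(list)
    for prop in api_entry.get("properties", []):
        grouped[prop["type"]].append(prop["url"])
    return dict(grouped)


grouped = urls_by_type(entry)
for kind, urls in grouped.items():
    # Print each property type followed by its URLs, in the entry's order.
    print(kind)
    for url in urls:
        print("  " + url)
```

Keeping the grouping keyed on `type` means additional property types can be added to the entry without changing the helper.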