RunPod Serverless

RunPod Serverless provides pay-as-you-go inference endpoints with autoscaling workers, queue-based and load-balanced endpoint types, FlashBoot cold-start optimization, and per-second billing. Each endpoint exposes a URL that accepts request payloads for AI model inference and compute-intensive workloads.

API entry from apis.yml

aid: runpod:serverless
name: RunPod Serverless
description: RunPod Serverless provides pay-as-you-go inference endpoints with autoscaling workers, queue-based
  and load-balanced endpoint types, FlashBoot cold-start optimization, and per-second billing. Each endpoint
  exposes a URL that accepts request payloads for AI model inference and compute-intensive workloads.
humanURL: https://docs.runpod.io/serverless/overview
baseURL: https://api.runpod.ai/v2
tags:
- AI
- Autoscaling
- GPU
- Inference
- Serverless
- Workers
properties:
- type: Documentation
  url: https://docs.runpod.io/serverless/overview
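
The entry above gives a base URL, and the description says each endpoint exposes a URL that accepts request payloads. As a rough sketch of what a call could look like, the Python snippet below builds (but does not send) an HTTP request against that base URL. Several details are assumptions not stated in this entry: the `/{endpoint_id}/runsync` and `/{endpoint_id}/run` route layout, the `{"input": ...}` payload wrapper, and the bearer-token `Authorization` header. The helper name `build_request` and the placeholder IDs are hypothetical. Check the linked documentation before relying on any of them.

```python
import json
import urllib.request

# Base URL taken from the baseURL field of the apis.yml entry.
BASE_URL = "https://api.runpod.ai/v2"


def build_request(endpoint_id: str, api_key: str, payload: dict,
                  sync: bool = True) -> urllib.request.Request:
    """Build a POST request for a serverless endpoint without sending it.

    Assumptions (not confirmed by the apis.yml entry):
      - routes look like {BASE_URL}/{endpoint_id}/runsync (wait for result)
        or {BASE_URL}/{endpoint_id}/run (enqueue and return immediately)
      - the job payload is wrapped as {"input": ...}
      - authentication uses an "Authorization: Bearer <key>" header
    """
    operation = "runsync" if sync else "run"
    url = f"{BASE_URL}/{endpoint_id}/{operation}"
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace with a real endpoint ID and API key.
    req = build_request("YOUR_ENDPOINT_ID", "YOUR_API_KEY", {"prompt": "Hello"})
    print(req.get_method(), req.full_url)
    # To actually send the request:
    # with urllib.request.urlopen(req, timeout=60) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it keeps the sketch safe to run without credentials and makes the assumed URL shape easy to inspect.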