Inferless Inference Endpoints API
Each deployed model exposes an auto-generated REST inference endpoint on a per-deployment host (m-..model-v1.inferless.com). The endpoint accepts a KServe v2-style inputs[] payload in which each entry carries name, shape, datatype, and data. Requests are authenticated with a workspace API key sent as a Bearer token, and usage is billed per second of GPU compute.
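As a rough illustration of the request shape described above, the following Python sketch builds (but does not send) a POST request with an inputs[] payload and a Bearer token. It is a minimal sketch under assumptions, not the official client: the helper names make_input and build_inference_request are hypothetical, the input name "prompt" and its BYTES datatype are placeholders that depend on the deployed model, and the full endpoint URL must be copied from the deployment's own page, since the host and path are not spelled out here.

```python
import json
import urllib.request


def make_input(name, shape, datatype, data):
    """Return one entry of the inputs[] array (hypothetical helper)."""
    return {"name": name, "shape": shape, "datatype": datatype, "data": data}


def build_inference_request(endpoint_url, api_key, inputs):
    """Build a POST request for an inference endpoint without sending it.

    endpoint_url: full URL copied from the deployment (assumption: the caller
                  supplies it; it is not derived or guessed here).
    api_key:      workspace API key, sent as a Bearer token.
    inputs:       list of dicts with name, shape, datatype, and data.
    """
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, pass the endpoint URL and API key for the deployment, for example build_inference_request(url, key, [make_input("prompt", [1], "BYTES", ["Hello"])]), and hand the result to send. Keeping request construction separate from sending makes the payload easy to inspect before any GPU time is billed.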
Documentation
https://docs.inferless.com/api-reference/model-endpoint/model-endpoint
API Reference
https://docs.inferless.com/api-reference/model-endpoint/test-your-model-endpoint