Hugging Face Deployment and Operations
Unified workflow for deploying, scaling, and operating ML model inference endpoints on dedicated infrastructure. Combines Hugging Face Inference Endpoints management with Text Generation Inference (TGI) server monitoring. Intended for ML platform engineers and DevOps teams.
What You Can Do
MCP Tools
list-endpoints
List all dedicated inference endpoints for a namespace.
create-endpoint
Create a new dedicated inference endpoint.
get-endpoint
Get details of a specific endpoint.
update-endpoint
Update an existing endpoint configuration.
delete-endpoint
Delete a dedicated inference endpoint.
pause-endpoint
Pause a running endpoint to stop billing; it stays paused until explicitly resumed.
resume-endpoint
Resume a paused endpoint.
scale-to-zero
Scale an endpoint to zero replicas; unlike pausing, a scaled-to-zero endpoint wakes automatically (after a cold start) when the next request arrives.
get-endpoint-logs
Get logs for an endpoint.
get-endpoint-metrics
Get metrics for an endpoint.
list-providers
List available cloud providers and hardware options.
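The lifecycle tools above mirror operations available in the official `huggingface_hub` Python client. A minimal sketch under that assumption follows; the endpoint name, model repository, and hardware values are illustrative placeholders, and the client is imported lazily so the pure helper can be read (and tested) without the library installed:

```python
def pick_idle_action(manual_resume: bool) -> str:
    """Choose between the two cost-saving tools: pausing stops billing
    until you explicitly resume, while scale-to-zero also stops compute
    billing but wakes automatically (after a cold start) on the next
    request."""
    return "pause" if manual_resume else "scale_to_zero"


def endpoint_lifecycle_demo(token: str) -> None:
    """Walk the create / pause / resume / scale-to-zero / delete cycle.

    Requires a Hugging Face token with Inference Endpoints permissions;
    all names and hardware choices below are placeholders.
    """
    # Imported here so the module loads without huggingface_hub installed.
    from huggingface_hub import create_inference_endpoint, get_inference_endpoint

    # create-endpoint: provision dedicated infrastructure for a model repo.
    endpoint = create_inference_endpoint(
        "demo-gpt2",                 # placeholder endpoint name
        repository="gpt2",
        framework="pytorch",
        task="text-generation",
        accelerator="cpu",
        vendor="aws",                # see list-providers for valid options
        region="us-east-1",
        instance_size="x2",
        instance_type="intel-icl",
        token=token,
    )
    endpoint.wait()                  # block until the endpoint is running

    # pause-endpoint / resume-endpoint
    endpoint.pause()
    endpoint.resume()

    # scale-to-zero: replicas drop to zero until the next request
    endpoint.scale_to_zero()

    # get-endpoint / delete-endpoint
    print(get_inference_endpoint("demo-gpt2", token=token).status)
    endpoint.delete()
```

The `pick_idle_action` helper just encodes the operational rule of thumb: pause when a human will decide when to bring the endpoint back, scale to zero when sporadic traffic should wake it on its own.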
tgi-health-check
Check if the TGI server is healthy and responding.
tgi-server-info
Get information about the deployed model and TGI server.
tgi-metrics
Get Prometheus metrics from the TGI server.
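The three TGI tools correspond to plain HTTP routes that a TGI server exposes alongside generation: `/health`, `/info`, and `/metrics`. A standard-library sketch of probing them, where `base_url` is a placeholder for your endpoint URL; the Prometheus parser is deliberately simple and keeps only unlabelled `name value` samples:

```python
import json
import urllib.request


def tgi_health(base_url: str) -> bool:
    """tgi-health-check: /health answers 200 when the server is ready."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, timeout, non-2xx, etc.
        return False


def tgi_info(base_url: str) -> dict:
    """tgi-server-info: /info returns JSON describing the loaded model."""
    with urllib.request.urlopen(f"{base_url}/info", timeout=5) as resp:
        return json.load(resp)


def parse_prometheus(text: str) -> dict:
    """tgi-metrics: /metrics is Prometheus text exposition format.

    Keep simple `name value` samples; skip comments, blank lines, and
    labelled series (those with `{...}` selectors).
    """
    metrics = {}
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue
        parts = line.split()
        if len(parts) == 2 and "{" not in parts[0]:
            try:
                metrics[parts[0]] = float(parts[1])
            except ValueError:
                pass  # non-numeric sample; ignore in this sketch
    return metrics
```

TGI's exporter publishes metric names prefixed with `tgi_` (for example queue and batch gauges), so filtering the parsed dict by that prefix is a quick way to separate model-server metrics from generic process metrics.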