Hugging Face · Capability

Hugging Face Deployment and Operations

A unified workflow for deploying, scaling, and operating ML model inference endpoints on dedicated infrastructure. Combines Inference Endpoints management with Text Generation Inference (TGI) server monitoring. Used by ML platform engineers and DevOps teams.

Run with Naftiko

Tags: Hugging Face, Deployment, Operations, Infrastructure, MLOps

What You Can Do

GET    /v1/endpoints/{namespace} — List all endpoints
POST   /v1/endpoints/{namespace} — Create a new endpoint
GET    /v1/endpoints/{namespace}/{endpoint_name} — Get endpoint details
PUT    /v1/endpoints/{namespace}/{endpoint_name} — Update endpoint configuration
DELETE /v1/endpoints/{namespace}/{endpoint_name} — Delete an endpoint
POST   /v1/endpoints/{namespace}/{endpoint_name}/pause — Pause a running endpoint
POST   /v1/endpoints/{namespace}/{endpoint_name}/resume — Resume a paused endpoint
GET    /v1/endpoints/{namespace}/{endpoint_name}/logs — Get endpoint logs
GET    /v1/endpoints/{namespace}/{endpoint_name}/metrics — Get endpoint metrics
GET    /v1/server/health — Check TGI server health
GET    /v1/server/info — Get TGI server info
GET    /v1/providers — List available cloud providers
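The endpoint routes above follow a consistent collection/resource/action shape. As a minimal sketch of how a client might build them, here is a small path helper in Python; the namespace and endpoint names used in the example are hypothetical, and authentication and the base host are out of scope for this sketch.

```python
from dataclasses import dataclass


@dataclass
class EndpointPaths:
    """Builds the request paths documented above for one namespace."""

    namespace: str

    def collection(self) -> str:
        # GET (list) and POST (create) share the collection path.
        return f"/v1/endpoints/{self.namespace}"

    def endpoint(self, name: str) -> str:
        # GET, PUT, and DELETE operate on a single named endpoint.
        return f"{self.collection()}/{name}"

    def pause(self, name: str) -> str:
        return f"{self.endpoint(name)}/pause"

    def resume(self, name: str) -> str:
        return f"{self.endpoint(name)}/resume"

    def logs(self, name: str) -> str:
        return f"{self.endpoint(name)}/logs"

    def metrics(self, name: str) -> str:
        return f"{self.endpoint(name)}/metrics"


# Hypothetical names, for illustration only.
paths = EndpointPaths(namespace="acme")
print(paths.pause("llama-prod"))  # /v1/endpoints/acme/llama-prod/pause
```

Only the pause/resume actions are POST sub-resources; logs and metrics are plain GET sub-resources, and the TGI routes (/v1/server/health, /v1/server/info) are server-wide rather than per-endpoint.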

MCP Tools

list-endpoints (read-only) — List all dedicated inference endpoints for a namespace.

create-endpoint — Create a new dedicated inference endpoint.

get-endpoint (read-only) — Get details of a specific endpoint.

update-endpoint (idempotent) — Update an existing endpoint configuration.

delete-endpoint — Delete a dedicated inference endpoint.

pause-endpoint — Pause a running endpoint to stop billing.

resume-endpoint — Resume a paused endpoint.

scale-to-zero — Scale an endpoint to zero replicas.

get-endpoint-logs (read-only) — Get logs for an endpoint.

get-endpoint-metrics (read-only) — Get metrics for an endpoint.

list-providers (read-only) — List available cloud providers and hardware options.

tgi-health-check (read-only) — Check whether the TGI server is healthy and responding.

tgi-server-info (read-only) — Get information about the deployed model and TGI server.

tgi-metrics (read-only) — Get Prometheus metrics from the TGI server.
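The tgi-metrics tool returns Prometheus exposition-format text. As a minimal sketch of consuming that output, here is a parser that turns simple sample lines into a dict; the metric names in the example payload are illustrative assumptions, not guaranteed TGI output.

```python
def parse_prometheus(text: str) -> dict[str, float]:
    """Parse Prometheus exposition-format lines into {series: value}.

    Skips comments (# HELP / # TYPE) and blank lines; the last
    space-separated token on a sample line is taken as the value,
    so labeled series keep their label set in the key.
    """
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        if not series:
            continue  # malformed line: no space-separated value
        samples[series] = float(value)
    return samples


# Illustrative payload; real TGI metric names may differ.
payload = """\
# HELP tgi_request_count Total requests
# TYPE tgi_request_count counter
tgi_request_count 42
tgi_queue_size 3
"""
metrics = parse_prometheus(payload)
print(metrics["tgi_request_count"])  # 42.0
```

A parsed snapshot like this pairs naturally with the endpoint lifecycle tools above, e.g. calling pause-endpoint or scale-to-zero when a queue-depth series stays at zero.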

APIs Used

hf-endpoints hf-tgi