Hugging Face · Capability

Hugging Face Model Inference

Unified workflow for running AI/ML inference across Hugging Face APIs, combining the Inference API, Inference Providers, and Text Generation Inference for NLP, vision, audio, and multimodal tasks. Used by ML engineers and AI application developers.

Run with Naftiko Hugging FaceMachine LearningInferenceAIText Generation

What You Can Do

POST

Run inference — Run inference on a model via the Inference API

/v1/inference/{model_id}

POST

Text generation — Generate text using the Inference API

/v1/text-generation/{model_id}

POST

Chat completion — Create chat completion via providers

/v1/chat/completions

POST

Text completion — Create text completion via providers

/v1/completions

POST

Create embeddings — Create text embeddings

/v1/embeddings

POST

Generate image — Generate images from text

/v1/images/generations

POST

Classify text — Classify text into categories

/v1/classification/{model_id}

POST

Summarize — Summarize text content

/v1/summarization/{model_id}

POST

Translate — Translate text between languages

/v1/translation/{model_id}

MCP Tools

run-inference

Run inference on any Hugging Face model by model ID.

read-only

generate-text

Generate text using a language model via the Inference API.

read-only

classify-text

Classify text into predefined categories.

read-only

answer-question

Answer questions based on provided context.

read-only

summarize-text

Summarize text content.

read-only

translate-text

Translate text between languages.

read-only

fill-mask

Fill in masked tokens in text.

read-only

extract-features

Extract feature vectors from text for embeddings.

read-only

classify-image

Classify images into categories.

read-only

detect-objects

Detect objects in images.

read-only

transcribe-speech

Transcribe audio to text using automatic speech recognition.

read-only

generate-image

Generate images from text prompts.

read-only

zero-shot-classify

Classify text without predefined training labels.

read-only

compute-similarity

Compute similarity between sentences.

read-only

providers-chat-completion

Create chat completion via OpenAI-compatible multi-provider API.

read-only

providers-text-completion

Create text completion via multi-provider API.

read-only

providers-create-embeddings

Create text embeddings via multi-provider API.

read-only

providers-generate-image

Generate images via multi-provider API.

read-only

providers-transcribe

Transcribe audio via multi-provider API.

read-only

providers-text-to-speech

Convert text to speech via multi-provider API.

read-only

tgi-generate

Generate text using the TGI native endpoint.

read-only

tgi-chat-completions

Create chat completions using TGI OpenAI-compatible Messages API.

read-only

tgi-tokenize

Tokenize input text and return token IDs.

read-only

tgi-server-info

Get TGI server information and deployed model details.

read-only

list-provider-models

List models available across all inference providers.

read-only

APIs Used

hf-inference hf-providers hf-tgi