Gemini
Google's Gemini API provides access to state-of-the-art generative AI models for text generation, multimodal understanding, code generation, and more.
APIs
Gemini REST API
REST API for accessing Gemini models for text generation, chat, embeddings, and multimodal tasks.
Gemini Python SDK
Python client library for the Gemini API.
Gemini Node.js SDK
Node.js client library for the Gemini API.
Gemini Go SDK
Go client library for the Gemini API, providing an interface for developers to integrate Google generative models into Go applications.
Gemini Java SDK
Java client library for the Gemini API, providing an interface for developers to integrate Google generative models into Java applications.
Gemini C# SDK
C# client library for the Gemini API, providing an interface for developers to integrate Google generative models into .NET applications.
Gemini Live API
Low-latency bidirectional streaming API enabling real-time voice and video interactions with Gemini models over WebSocket connections.
Gemini Interactions API
Unified interface for interacting with Gemini models and agents, simplifying state management, tool orchestration, and long-running tasks as an improved alternative to generateC...
Gemini Image Generation API
Image generation capabilities through the Gemini API, supporting text-to-image generation, image editing, and multi-turn conversational editing.
Gemini Video Generation API
Video generation capabilities through the Gemini API powered by Veo, supporting text-to-video and image-to-video generation in resolutions up to 4K.
Gemini Text-to-Speech API
Native audio generation text-to-speech capabilities through the Gemini API, supporting single and multi-speaker speech synthesis with natural language control over style, accent...
Gemini Files API
API for uploading and managing media files for use with Gemini models, supporting images, audio, video, and documents up to 2 GB per file with 20 GB per project storage.
Gemini Embeddings API
Text embedding capabilities through the Gemini API, generating vector representations for semantic search, classification, clustering, and retrieval augmented generation (RAG) a...
Gemini Batch API
Asynchronous batch processing API for submitting large volumes of Gemini API requests at 50 percent of the standard cost, with support for content generation, embeddings, and Op...
Gemini Deep Research API
Agentic research capability powered by the Interactions API that autonomously plans, executes, and synthesizes multi-step research tasks using web search and URL context to prod...
Features
Process and understand text, images, audio, video, and documents in a single model.
Define custom functions that Gemini can invoke to interact with external systems and APIs.
Generate JSON responses conforming to specified schemas for reliable data extraction.
Cache large context windows to reduce latency and cost for repeated queries.
Execute Python code in a sandboxed environment for computational tasks.
Ground model responses with real-time Google Search results for factual accuracy.
Real-time bidirectional voice and video interactions over WebSocket connections.
Generate images and videos from text prompts using Gemini and Veo models.
Native audio generation with multi-speaker support and natural language style control.
Autonomous multi-step research agent that synthesizes cited reports from web sources.
Extended reasoning capability for complex problem-solving and analysis tasks.
Use Cases
Build conversational AI assistants with multimodal understanding and function calling.
Extract structured data from documents, PDFs, and images using vision capabilities.
Generate text, images, and video content with AI for marketing and creative workflows.
Generate, explain, and debug code across multiple programming languages.
Build search systems using Gemini embeddings for semantic similarity matching.
Translate text and audio in real-time using multimodal capabilities.
Integrations
Access Gemini models through Vertex AI for enterprise-grade deployment and management.
Prototype and test Gemini API calls with the web-based development environment.
Use Gemini as a provider in LangChain for building AI application pipelines.
Integrate Gemini with Firebase for mobile and web app AI features.
Use Gemini through OpenAI-compatible API endpoints for easy migration.