Overview
The Google GenAI provider gives you access to Gemini models with advanced multimodal capabilities, embeddings, and image generation through Imagen.
Installation
createGoogleGenAI()
Create a Google GenAI provider instance.
Options
- Your Google AI API key. Defaults to the GOOGLE_GENAI_API_KEY environment variable.
- API version to use (e.g., 'v1', 'v1beta'). Defaults to the latest stable version.
- Custom base URL for API requests.
- Provide your own configured Google GenAI client instance.
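The API key fallback described above can be sketched as follows. This is a minimal illustration only: `resolveApiKey` and `API_KEY_ENV_VAR` are hypothetical names, not part of the provider, and the environment is passed in as a plain record so the snippet does not depend on any runtime.

```typescript
// Name of the environment variable the provider falls back to (from the option description).
const API_KEY_ENV_VAR = "GOOGLE_GENAI_API_KEY";

// Hypothetical helper: prefer an explicitly passed key, otherwise read the
// environment variable, and fail loudly if neither is available.
function resolveApiKey(
  explicitKey: string | undefined,
  env: Record<string, string | undefined>,
): string {
  const key = explicitKey ?? env[API_KEY_ENV_VAR];
  if (!key) {
    throw new Error(`Missing API key: pass one explicitly or set ${API_KEY_ENV_VAR}`);
  }
  return key;
}
```

In Node.js, `process.env` would be the natural value to pass as `env`.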
Returns
GoogleGenAIProvider - Provider instance with methods to create models.
Provider Methods
chatModel()
Create a chat model instance.
- Model identifier. See Supported Models below.
embeddingModel()
Create an embedding model instance.
- Embedding model identifier.
imageModel()
Create an image generation model instance.
- Image model identifier.
Supported Models
Chat Models
Gemini 3 (Thinking Level)
Latest generation with thinking level control.
- gemini-3-pro - Most capable multimodal model
Gemini 2.5 (Thinking Budget)
Previous generation with thinking budget control.
- gemini-2.5-pro - High capability, budget-based thinking
- gemini-2.5-flash - Fast with optional thinking
- gemini-2.5-flash-lite - Lightweight with optional thinking
Embedding Models
- text-embedding-004 - Latest text embedding model
- text-embedding-003 - Previous generation
Image Models
- imagen-3.0 - Latest image generation model
- imagen-2.0 - Previous generation
Capabilities
| Feature | Support |
|---|---|
| Chat Completion | ✓ |
| Streaming | ✓ |
| Function Calling | ✓ |
| Vision | ✓ |
| Reasoning Effort | ✓ |
| Embeddings | ✓ |
| Image Generation | ✓ |
Thinking Modes
Gemini models use two different thinking control mechanisms:
Thinking Level (Gemini 3)
Used by Gemini 3 models. Simple HIGH/LOW thinking control.
- 'minimal', 'low', 'medium' → 'LOW'
- 'high', 'max' → 'HIGH'
Gemini 3 models cannot disable thinking completely.
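The effort-to-level mapping for Gemini 3 can be written out as a small function. This is a sketch of the mapping as documented here, not the provider's internal code; `ReasoningEffort`, `ThinkingLevel`, and `toThinkingLevel` are names assumed for illustration.

```typescript
// The five effort names and the two thinking levels listed for Gemini 3.
type ReasoningEffort = "minimal" | "low" | "medium" | "high" | "max";
type ThinkingLevel = "LOW" | "HIGH";

// Hypothetical helper: collapse an effort name into a Gemini 3 thinking level.
function toThinkingLevel(effort: ReasoningEffort): ThinkingLevel {
  switch (effort) {
    case "minimal":
    case "low":
    case "medium":
      return "LOW";
    case "high":
    case "max":
      return "HIGH";
  }
}
```

Because every effort name maps to either LOW or HIGH, there is no value that turns thinking off.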
Thinking Budget (Gemini 2.5)
Used by Gemini 2.5 models. Fine-grained token budget control.
- minimal → 1,024 tokens
- low → 4,096 tokens
- medium → 16,384 tokens
- high/max → 32,768 tokens
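The budget values above can be captured in a lookup table. The sketch below only restates the documented numbers; `THINKING_BUDGET_TOKENS` and `toThinkingBudget` are hypothetical names, and how the provider actually passes the budget to the API is not shown here.

```typescript
type ReasoningEffort = "minimal" | "low" | "medium" | "high" | "max";

// Token budgets per effort name, as listed for Gemini 2.5 models.
const THINKING_BUDGET_TOKENS: Record<ReasoningEffort, number> = {
  minimal: 1024,
  low: 4096,
  medium: 16384,
  high: 32768,
  max: 32768, // 'high' and 'max' share the same budget
};

// Hypothetical helper: look up the thinking budget for an effort name.
function toThinkingBudget(effort: ReasoningEffort): number {
  return THINKING_BUDGET_TOKENS[effort];
}
```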
gemini-2.5-flash and gemini-2.5-flash-lite can disable thinking by omitting the reasoning parameter.
Examples
Basic Chat
Reasoning with Thinking
Multimodal Input
Embeddings
Image Generation
Function Calling
Streaming
Custom API Version
Without Thinking (Flash models)
Error Handling
Best Practices
Choosing the Right Model
- Gemini 3 Pro - Best for complex multimodal tasks
- Gemini 2.5 Pro - When you need fine-grained thinking budget control
- Gemini 2.5 Flash - Fast responses with optional thinking
- Gemini 2.5 Flash Lite - Lightweight tasks, cost-sensitive applications
Thinking Control
- Use HIGH effort for complex reasoning in Gemini 3
- Use thinking budget for precise control in Gemini 2.5
- Disable thinking on Flash models for simple queries to save costs
Multimodal Best Practices
- Gemini excels at understanding images, videos, and audio
- Provide clear instructions for multimodal tasks
- Consider using Flash models for vision-only tasks
Model Comparison
| Model | Thinking Control | Can Disable | Best For |
|---|---|---|---|
| Gemini 3 Pro | Level (HIGH/LOW) | ✗ | Complex multimodal |
| Gemini 2.5 Pro | Budget (tokens) | ✗ | Controlled reasoning |
| Gemini 2.5 Flash | Budget (tokens) | ✓ | Fast + flexible |
| Gemini 2.5 Flash Lite | Budget (tokens) | ✓ | Lightweight tasks |
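The comparison table can also be expressed as data, which is handy when application code needs to decide whether a request may omit the reasoning parameter. This is a sketch built only from the table rows and the model identifiers listed under Supported Models; `MODEL_THINKING` and `canDisableThinking` are hypothetical names, not provider exports.

```typescript
type ThinkingControl = "level" | "budget";

interface ModelThinkingInfo {
  control: ThinkingControl; // how thinking is configured for the model
  canDisable: boolean;      // whether thinking can be turned off
}

// One entry per row of the comparison table.
const MODEL_THINKING: Record<string, ModelThinkingInfo> = {
  "gemini-3-pro": { control: "level", canDisable: false },
  "gemini-2.5-pro": { control: "budget", canDisable: false },
  "gemini-2.5-flash": { control: "budget", canDisable: true },
  "gemini-2.5-flash-lite": { control: "budget", canDisable: true },
};

// Hypothetical helper: report whether thinking can be disabled for a model.
function canDisableThinking(modelId: string): boolean {
  const info = MODEL_THINKING[modelId];
  if (info === undefined) {
    throw new Error(`Unknown model: ${modelId}`);
  }
  return info.canDisable;
}
```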