--- sidebar_label: Google AI / Gemini description: Configure Google's Gemini models with support for text, images, and video inputs through Google AI Studio API for comprehensive multimodal LLM testing and evaluation --- # Google AI / Gemini The `google` provider enables integration with Google AI Studio and the Gemini API. It provides access to Google's Gemini and hosted Gemma models with support for text, images, and video inputs. If you are using Vertex AI instead of Google AI Studio, see the [`vertex` provider](/docs/providers/vertex). ## Authentication To use the Google AI Studio API, you need to authenticate using an API key. Follow these steps: ### 1. Get an API Key 1. Visit [Google AI Studio](https://aistudio.google.com/) 2. Click on "Get API key" in the left sidebar 3. Create a new API key or use an existing one 4. Copy your API key **Security Note:** Never commit API keys to version control. Always use environment variables or a `.env` file that's added to `.gitignore`. ### 2. Configure Authentication You have three options for providing your API key: #### Option 1: Environment Variable (Recommended) Set the `GOOGLE_API_KEY` environment variable: ```bash # Using export (Linux/macOS) export GOOGLE_API_KEY="your_api_key_here" # Using set (Windows Command Prompt) set GOOGLE_API_KEY=your_api_key_here # Using $env (Windows PowerShell) $env:GOOGLE_API_KEY="your_api_key_here" ``` #### Option 2: .env File (Recommended for Development) Create a `.env` file in your project root: ```bash # .env GOOGLE_API_KEY=your_api_key_here ``` Promptfoo automatically loads environment variables from `.env` files in your project directory. Make sure to add `.env` to your `.gitignore` file. #### Option 3: Provider Configuration Specify the API key directly in your configuration: ```yaml providers: - id: google:gemini-2.5-flash config: apiKey: your_api_key_here ``` **Note:** Avoid hardcoding API keys in configuration files that might be committed to version control. The API key is automatically detected from the `GOOGLE_API_KEY` environment variable, so you typically don't need to specify it in the config. If you need to explicitly reference an environment variable in your config, use Nunjucks template syntax: ```yaml providers: - id: google:gemini-2.5-flash # Uses GOOGLE_API_KEY env var config: # apiKey: "{{ env.GOOGLE_API_KEY }}" # optional, auto-detected temperature: 0.7 ``` ### 3. Verify Authentication Test your setup with a simple prompt: ```bash promptfoo eval --prompt "Hello, how are you?" --providers google:gemini-2.5-flash ``` ## Configuration Options In addition to authentication, you can configure: - `GOOGLE_API_HOST` - Override the Google API host (defaults to `generativelanguage.googleapis.com`) - `GOOGLE_API_BASE_URL` - Override the Google API base URL (defaults to `https://generativelanguage.googleapis.com`) Example with custom host: ```yaml providers: - id: google:gemini-2.5-flash config: apiHost: custom.googleapis.com apiBaseUrl: https://custom.googleapis.com ``` For promptfoo's built-in cost estimates, Google providers also support `config.cost`, `config.inputCost`, and `config.outputCost`. Use `inputCost` and `outputCost` for separate prompt and completion pricing. The legacy `cost` option remains the shared fallback. ## Quick Start ### 1. 
Basic Evaluation Create a simple `promptfooconfig.yaml`: ```yaml # promptfooconfig.yaml providers: - google:gemini-2.5-flash prompts: - 'Write a haiku about {{topic}}' tests: - vars: topic: 'artificial intelligence' - vars: topic: 'the ocean' ``` Run the eval: ```bash promptfoo eval ``` ### 2. Comparing Models Compare different Gemini and Gemma models: ```yaml providers: - google:gemma-4-31b-it - google:gemini-2.5-flash - google:gemini-2.5-pro - google:gemini-2.0-flash prompts: - 'Explain {{concept}} in simple terms' tests: - vars: concept: 'quantum computing' assert: - type: contains value: 'qubit' - type: llm-rubric value: 'The explanation should be understandable by a high school student' ``` ### 3. Using Environment Variables ```yaml # Reference environment variables in your config providers: - id: google:gemini-2.5-flash # Uses GOOGLE_API_KEY env var config: # apiKey: "{{ env.GOOGLE_API_KEY }}" # optional, auto-detected temperature: '{{ env.TEMPERATURE | default(0.7) }}' # Default to 0.7 if not set ``` ## Troubleshooting ### Common Issues #### 1. API Key Not Found **Error**: `API key not found` **Solution**: Ensure your API key is properly set: ```bash # Check if the environment variable is set echo $GOOGLE_API_KEY # If empty, set it again export GOOGLE_API_KEY="your_api_key_here" ``` #### 2. Invalid API Key **Error**: `API key not valid. Please pass a valid API key` **Solutions**: - Verify your API key at [Google AI Studio](https://aistudio.google.com/) - Ensure you're using the correct API key (not a project ID or other credential) - Check that your API key has the necessary permissions #### 3. Rate Limiting **Error**: `Resource has been exhausted` **Solutions**: - Add delays between requests: ```yaml # promptfooconfig.yaml evaluateOptions: delay: 1000 # 1 second delay between API calls ``` - Upgrade your API quota in Google AI Studio - Use a lower rate tier model like `gemini-2.5-flash-lite` #### 4. Model Not Available **Error**: `Model not found` **Solutions**: - Check the model name spelling - Ensure the model is available in your region - Verify the model is listed in the [available models](https://ai.google.dev/models) ### Debugging Tips 1. **Enable verbose logging**: ```bash promptfoo eval --verbose ``` 2. **Test your API key directly**: ```bash curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \ -H "Content-Type: application/json" \ -d '{"contents":[{"parts":[{"text":"Hello"}]}]}' ``` 3. 
**Check your environment**: ```bash # List all GOOGLE_ environment variables env | grep GOOGLE_ ``` ## Migration Guide ### Migrating from Google AI Studio to Vertex AI If you need more advanced features or enterprise capabilities, you can migrate to Vertex AI: | Google AI Studio | Vertex AI | Notes | | ------------------------- | ----------------------------- | --------------------------------------- | | `google:gemini-2.5-flash` | `vertex:gemini-2.5-flash` | Same model, different endpoint | | `GOOGLE_API_KEY` | `GOOGLE_CLOUD_PROJECT` + auth | Vertex uses Google Cloud authentication | | Simple API key | Multiple auth methods | Vertex supports ADC, service accounts | | Global endpoint | Regional endpoints | Vertex requires region selection | Example migration: ```yaml # Before (Google AI Studio) providers: - google:gemini-2.5-pro # After (Vertex AI) providers: - vertex:gemini-2.5-pro config: projectId: my-project-id region: us-central1 ``` See the [Vertex AI provider documentation](/docs/providers/vertex) for detailed setup instructions. ## Available Models ### Chat and Multimodal Models - `google:gemma-4-31b-it` - Gemma 4 31B instruction-tuned open model with strong reasoning, coding, and agentic capabilities - `google:gemma-4-26b-a4b-it` - Gemma 4 26B A4B instruction-tuned open model for lower-latency reasoning and coding evals - `google:gemini-3.1-pro-preview` - Gemini 3.1 Pro preview with improved reasoning and performance ($2/1M input, $12/1M output; $4/$18 above 200K) - `google:gemini-3.1-pro-preview-customtools` - Gemini 3.1 Pro preview variant for custom tools with the same pricing as Gemini 3.1 Pro - `google:gemini-3.1-flash-lite-preview` - Gemini 3.1 Flash-Lite preview optimized for high-volume, low-latency tasks ($0.25/1M text/image/video input, $1.50/1M output) - `google:gemini-3-flash-preview` - Gemini 3.0 Flash preview with frontier intelligence, Pro-grade reasoning at Flash-level speed, thinking, and grounding ($0.50/1M input, $3/1M output) - `google:gemini-2.5-pro` - Gemini 2.5 Pro model with enhanced reasoning, coding, and multimodal understanding - `google:gemini-2.5-flash` - Gemini 2.5 Flash model with enhanced reasoning and thinking capabilities - `google:gemini-2.5-flash-lite` - Cost-efficient Gemini 2.5 model optimized for high-volume, latency-sensitive tasks - `google:gemini-flash-latest` - Google-maintained alias for the latest Gemini Flash release - `google:gemini-flash-lite-latest` - Google-maintained alias for the latest Gemini Flash-Lite release - `google:gemini-2.0-flash` - Multimodal model with next-gen features, 1M token context window - `google:gemini-2.0-flash-lite` - Cost-efficient version of 2.0 Flash with 1M token context ### Embedding Models Use the `google:embedding:` prefix (or the plural `google:embeddings:` alias) to call the Gemini API `embedContent` endpoint: - `google:embedding:gemini-embedding-001` - Recommended default. 
Multilingual plus code, up to 3,072 dimensions, 2,048 input-token limit Optional config keys (forwarded as documented in Google's [embedContent reference](https://ai.google.dev/api/embeddings#EmbedContentRequest)): - `taskType` - one of `SEMANTIC_SIMILARITY`, `CLASSIFICATION`, `CLUSTERING`, `RETRIEVAL_DOCUMENT`, `RETRIEVAL_QUERY`, `QUESTION_ANSWERING`, `FACT_VERIFICATION`, `CODE_RETRIEVAL_QUERY` - `outputDimensionality` - truncates the returned vector (useful for storage cost) - `title` - document title, only applied with `taskType: RETRIEVAL_DOCUMENT` If you need Vertex authentication or additional embedding models, see the [Vertex provider](/docs/providers/vertex#embedding-models) instead. ### Image Generation Models Imagen models are available through both **Google AI Studio** and **Vertex AI**. Use the `google:image:` prefix: #### Imagen 4 Models (Available in both Google AI Studio and Vertex AI) - `google:image:imagen-4.0-ultra-generate-preview-06-06` - Ultra quality ($0.06/image) - `google:image:imagen-4.0-generate-preview-06-06` - Standard quality ($0.04/image) - `google:image:imagen-4.0-fast-generate-preview-06-06` - Fast generation ($0.02/image) #### Imagen 3 Models (Vertex AI only) - `google:image:imagen-3.0-generate-002` - Imagen 3.0 ($0.04/image) - `google:image:imagen-3.0-generate-001` - Imagen 3.0 ($0.04/image) - `google:image:imagen-3.0-fast-generate-001` - Imagen 3.0 fast ($0.02/image) #### Authentication Options **Option 1: Google AI Studio** (Quick start, limited features) ```bash export GOOGLE_API_KEY=your-api-key ``` - ✅ Simpler setup with API key - ✅ Supports Imagen 4 models - ❌ No support for Imagen 3 models - ❌ No support for `seed` or `addWatermark` parameters **Option 2: Vertex AI** (Full features) ```bash gcloud auth application-default login export GOOGLE_PROJECT_ID=your-project-id ``` - ✅ All Imagen models supported - ✅ All configuration parameters supported - ❌ Requires Google Cloud project with billing The provider automatically selects the appropriate API based on available credentials. Configuration options: ```yaml providers: - google:image:imagen-3.0-generate-002 config: projectId: 'your-project-id' # Or set GOOGLE_PROJECT_ID region: 'us-central1' # Optional, defaults to us-central1 aspectRatio: '16:9' seed: 42 addWatermark: false # Must be false when using seed ``` See the [Google Imagen example](https://github.com/promptfoo/promptfoo/tree/main/examples/google-imagen). ### Gemini Native Image Generation Models Gemini models can generate images natively using the `generateContent` API. 
Models with `-image` in the name automatically enable image generation: - `google:gemini-3.1-flash-image-preview` - Gemini 3.1 Flash (Nano Banana 2) with native image generation (~$0.067/image at 1K) - `google:gemini-3-pro-image-preview` - Gemini 3 Pro with advanced image generation (~$0.05/image, estimated) - `google:gemini-2.5-flash-image` - Gemini 2.5 Flash with image generation (~$0.04/image) Configuration options: ```yaml providers: - id: google:gemini-3.1-flash-image-preview config: imageAspectRatio: '16:9' # 1:1, 1:4, 1:8, 2:3, 3:2, 3:4, 4:1, 4:3, 4:5, 5:4, 8:1, 9:16, 16:9, 21:9 imageSize: '2K' # 512px (3.1 only), 1K, 2K, 4K temperature: 0.7 ``` Key differences from Imagen: - Uses same namespace as Gemini chat (`google:model-name`) - More aspect ratio options (includes 1:4, 1:8, 2:3, 3:2, 4:1, 4:5, 5:4, 8:1, 21:9) - Resolution control via `imageSize` (`512px`, `1K`, `2K`, `4K`) - `512px` is Gemini 3.1 only - Can return both text and images in the same response - Uses same authentication as Gemini chat models - Supports Google Search grounding via `tools` Google Search grounding lets the model use real-time search results to inform image generation: ```yaml providers: - id: google:gemini-3.1-flash-image-preview config: imageAspectRatio: '16:9' tools: - googleSearch: {} ``` See the [Google Imagen example](https://github.com/promptfoo/promptfoo/tree/main/examples/google-imagen) for Gemini image generation configurations. ### Video Generation Models (Veo) Google's Veo models enable AI-powered video generation from text prompts. Use the `google:video:` prefix with `GOOGLE_API_KEY` / `GEMINI_API_KEY` for Google AI Studio. For explicit Vertex AI routing, use the `vertex:video:` prefix instead. #### Available Models | Model | Description | Duration Support | | --------------------------------------- | ------------------------------------------------- | ---------------- | | `google:video:veo-3.1-generate-preview` | Latest Veo 3.1 model with video extension support | 4, 6, 8 seconds | | `google:video:veo-3.1-fast-preview` | Fast Veo 3.1 model | 4, 6, 8 seconds | | `google:video:veo-3-generate` | Veo 3.0 standard model | 4, 6, 8 seconds | | `google:video:veo-3-fast` | Veo 3.0 fast model | 4, 6, 8 seconds | | `google:video:veo-2-generate` | Veo 2.0 model | 5, 6, 8 seconds | #### Basic Usage ```yaml providers: - id: google:video:veo-3.1-generate-preview config: # Uses GOOGLE_API_KEY / GEMINI_API_KEY by default aspectRatio: '16:9' # or '9:16' resolution: '720p' # or '1080p' durationSeconds: 6 # 4, 6, or 8 for Veo 3.x; 5, 6, or 8 for Veo 2 prompts: - 'Generate a video of {{subject}}' tests: - vars: subject: 'a cat playing with a ball of yarn' ``` :::note `google:video:*` uses Google AI Studio by default and can auto-detect Vertex AI when project-based auth is configured. Existing project-based `google:video:*` configs remain compatible, but `vertex:video:*` is the recommended explicit path for Vertex-only flows like `extendVideoId`. 
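For example, the two routing forms for the same Veo 3.1 model listed above look like this:

```yaml
providers:
  - google:video:veo-3.1-generate-preview # Google AI Studio (API key) by default
  - vertex:video:veo-3.1-generate-preview # Explicit Vertex AI routing
```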
::: #### Configuration Options | Option | Type | Description | | ------------------ | ------ | ----------------------------------------------------------------------------------------------------------- | | `aspectRatio` | string | Video aspect ratio: `16:9` (default) or `9:16` | | `resolution` | string | Video resolution: `720p` (default) or `1080p` | | `durationSeconds` | number | Video duration: 4, 6, 8 for Veo 3.x; 5, 6, 8 for Veo 2 | | `personGeneration` | string | Person generation mode: `allow_adult` or `dont_allow` | | `negativePrompt` | string | Concepts to avoid in the generated video | | `referenceImages` | array | Up to 3 reference images (file paths or objects, Veo 3.1 only) | | `image` | string | Source image for image-to-video generation | | `lastImage` | string | End frame for interpolation (requires `image`) | | `extendVideoId` | string | Operation ID from a previous Vertex Veo generation (Veo 3.1 only) | | `sourceVideo` | string | Source video input. In Google AI Studio use base64 or `file://`; in Vertex you can also use an operation ID | #### Image-to-Video Generation Generate videos from a starting image: ```yaml providers: - id: google:video:veo-3.1-generate-preview config: image: file://assets/start-frame.jpg aspectRatio: '16:9' durationSeconds: 6 prompts: - 'Animate this image: {{animation_description}}' tests: - vars: animation_description: 'the character slowly turns to face the camera' ``` #### Video Interpolation (First and Last Frame) Generate video that transitions between two images: ```yaml providers: - id: google:video:veo-3.1-generate-preview config: image: file://assets/start.jpg # First frame lastImage: file://assets/end.jpg # Last frame durationSeconds: 6 prompts: - 'Create a smooth transition between these frames' ``` #### Video Extension (Veo 3.1 Only) Extend a previously generated Veo video using its operation ID: ```yaml providers: - id: vertex:video:veo-3.1-generate-preview config: # Use the operation ID from a previous Veo generation extendVideoId: projects/my-project/locations/us-central1/publishers/google/models/veo-3.1-generate-preview/operations/abc123 durationSeconds: 6 prompts: - 'Continue this video with {{continuation}}' tests: - vars: continuation: 'the camera panning to reveal a sunset' ``` :::note `extendVideoId` is a Vertex AI flow and requires an operation ID from a previous Veo generation. For Google AI Studio, pass base64 or a `file://` video via `sourceVideo` instead. Older `google:video:*` configs with project-based auth still work through Vertex auto-detection, but `vertex:video:*` is the clearer form. ::: #### Reference Images Use up to 3 reference images to guide video style (Veo 3.1 only): ```yaml providers: - id: google:video:veo-3.1-generate-preview config: referenceImages: # Simple format: file paths (uses 'asset' reference type) - file://assets/style-ref-1.jpg - file://assets/style-ref-2.jpg aspectRatio: '16:9' durationSeconds: 6 ``` You can also use the object format to specify the reference type: ```yaml referenceImages: - image: file://assets/character.jpg referenceType: asset - image: file://assets/background.jpg referenceType: asset ``` #### Storage Generated videos are stored in promptfoo's blob storage system, which uses content-addressable hashing for deduplication. Videos with identical content share the same storage reference. 
Use `--no-cache` to force regeneration: ```bash promptfoo eval --no-cache ``` See the [Google Video example](https://github.com/promptfoo/promptfoo/tree/main/examples/google-video) for complete configurations. ### Basic Configuration The provider supports various configuration options that can be used to customize the behavior of the model: ```yaml providers: - id: google:gemini-2.5-pro config: temperature: 0.7 # Controls randomness (0.0 to 1.0) maxOutputTokens: 2048 # Maximum length of response topP: 0.9 # Nucleus sampling topK: 40 # Top-k sampling stopSequences: ['END'] # Stop generation at these sequences ``` ### Thinking Configuration For models that support thinking capabilities, you can configure how the model reasons through problems. #### Gemini 3 Models (thinkingLevel) Gemini 3 models use `thinkingLevel` for more granular control: ```yaml providers: - id: google:gemini-3-flash-preview config: generationConfig: thinkingConfig: thinkingLevel: MEDIUM # MINIMAL, LOW, MEDIUM, or HIGH ``` | Level | Description | | ------- | ---------------------------------------------------------- | | MINIMAL | Fewest tokens. Best for low-complexity tasks (Flash only). | | LOW | Fewer tokens. Suitable for simpler tasks. | | MEDIUM | Balanced approach for moderate complexity (Flash only). | | HIGH | More tokens for deep reasoning. Default. | #### Gemini 2.5 Models (thinkingBudget) Gemini 2.5 models use `thinkingBudget`: ```yaml providers: - id: google:gemini-2.5-flash config: generationConfig: temperature: 0.7 maxOutputTokens: 2048 thinkingConfig: thinkingBudget: 1024 # Controls tokens allocated for thinking process ``` The thinking configuration allows the model to show its reasoning process before providing the final answer, which can be helpful for complex tasks that require step-by-step thinking. **Note:** You cannot use both `thinkingLevel` and `thinkingBudget` in the same request. You can also specify a response schema for structured output: ```yaml providers: - id: google:gemini-2.5-pro config: generationConfig: response_mime_type: application/json response_schema: type: object properties: foo: type: string ``` For multimodal inputs (images and video), the provider supports: - Images: PNG, JPEG, WEBP, HEIC, HEIF formats (max 3,600 files) - Videos: MP4, MPEG, MOV, AVI, FLV, MPG, WEBM, WMV, 3GPP formats (up to ~1 hour) When using images, place them on separate lines in your prompt. The `file://` prefix automatically handles loading and encoding: ```yaml prompts: | {{imageFile}} Caption this image. providers: - id: google:gemini-2.5-flash tests: - vars: imageFile: file://assets/red-panda.jpg ``` ### Safety Settings Safety settings can be configured to control content filtering: ```yaml providers: - id: google:gemini-2.5-pro config: safetySettings: - category: HARM_CATEGORY_DANGEROUS_CONTENT threshold: BLOCK_ONLY_HIGH # or other thresholds ``` ### System Instructions Configure system-level instructions for the model: ```yaml providers: - id: google:gemini-2.5-pro config: # Direct text systemInstruction: 'You are a helpful assistant' # Or load from file systemInstruction: file://system-instruction.txt ``` System instructions support Nunjucks templating and can be loaded from external files for better organization and reusability. ### Role Mapping Configuration Gemini models require specific role names in chat messages. By default, Promptfoo uses the `model` role for compatibility with newer Gemini versions (2.5+). 
For older Gemini versions that expect the `assistant` role, you can disable this: ```yaml providers: # Default behavior - maps 'assistant' to 'model' (for Gemini 2.5+) - id: google:gemini-2.5-flash config: temperature: 0.7 # For older Gemini versions - preserve 'assistant' role - id: google:gemini-2.5-pro config: useAssistantRole: true # Preserves 'assistant' role without mapping temperature: 0.7 ``` For more details on capabilities and configuration options, see the [Gemini API documentation](https://ai.google.dev/docs). ## Model Examples ### Gemini 3 Flash Preview Gemini 3.0 Flash with frontier intelligence, Pro-grade reasoning, and thinking capabilities: ```yaml providers: - id: google:gemini-3-flash-preview config: temperature: 0.7 maxOutputTokens: 4096 generationConfig: thinkingConfig: thinkingLevel: MEDIUM # MINIMAL, LOW, MEDIUM, or HIGH ``` Thinking levels for Gemini 3 Flash: MINIMAL (fastest), LOW, MEDIUM (balanced), HIGH (most thorough). ### Gemini 3.1 Pro Preview Gemini 3.1 Pro with improved reasoning and agentic capabilities: ```yaml providers: - id: google:gemini-3.1-pro-preview config: temperature: 0.7 maxOutputTokens: 4096 generationConfig: thinkingConfig: thinkingLevel: HIGH # LOW or HIGH (Pro only supports these two levels) ``` Thinking levels for Gemini 3.1 Pro: LOW (faster, simpler tasks), HIGH (deep reasoning, default). ### Gemini 2.5 Pro Gemini 2.5 Pro model for complex reasoning, coding, and multimodal understanding: ```yaml providers: - id: google:gemini-2.5-pro config: temperature: 0.7 maxOutputTokens: 4096 topP: 0.9 topK: 40 generationConfig: thinkingConfig: thinkingBudget: 2048 # Enhanced thinking for complex tasks ``` ### Gemini 2.5 Flash Gemini 2.5 Flash model with enhanced reasoning and thinking capabilities: ```yaml providers: - id: google:gemini-2.5-flash config: temperature: 0.7 maxOutputTokens: 2048 topP: 0.9 topK: 40 generationConfig: thinkingConfig: thinkingBudget: 1024 # Fast model with thinking capabilities ``` ### Gemini 2.5 Flash-Lite Cost-efficient and fast model for high-volume, latency-sensitive tasks: ```yaml providers: - id: google:gemini-2.5-flash-lite config: temperature: 0.7 maxOutputTokens: 1024 topP: 0.9 topK: 40 generationConfig: thinkingConfig: thinkingBudget: 512 # Optimized for speed and cost efficiency ``` ### Gemini 2.0 Flash Best for fast, efficient responses and general tasks: ```yaml providers: - id: google:gemini-2.0-flash config: temperature: 0.7 maxOutputTokens: 2048 topP: 0.9 topK: 40 ``` ## Advanced Features ### Overriding Providers You can override both the text generation and embedding providers in your configuration. Because of how model-graded evals are implemented, **the text generation model must support chat-formatted prompts**. You can override providers in several ways: 1. For all test cases using `defaultTest`: ```yaml title="promptfooconfig.yaml" defaultTest: options: provider: # Override text generation provider text: id: google:gemini-2.0-flash config: temperature: 0.7 # Override embedding provider for similarity comparisons embedding: id: google:embedding:gemini-embedding-001 ``` 2. For individual assertions: ```yaml assert: - type: similar value: Expected response threshold: 0.8 provider: id: google:embedding:gemini-embedding-001 ``` 3. For specific tests: ```yaml tests: - vars: puzzle: What is 2 + 2? 
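    # These per-test options override the text and embedding graders
    # for this test only (same keys as the defaultTest example above)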
options: provider: text: id: google:gemini-2.0-flash embedding: id: google:embedding:gemini-embedding-001 assert: - type: similar value: The answer is 4 ``` ### Tool Calling Google models support tool calling via the `tools` and `tool_config` config fields. The model returns tool calls in its response for your application to execute. ```yaml providers: - id: google:gemini-2.5-pro config: tools: function_declarations: - name: 'get_weather' description: 'Get current weather for a location' parameters: type: 'object' properties: location: type: 'string' description: 'City name or coordinates' units: type: 'string' enum: ['celsius', 'fahrenheit'] required: ['location'] tool_config: function_calling_config: mode: 'auto' # or 'none' to disable ``` For practical examples of function calling with Google AI models, see the [google-vertex-tools example](https://github.com/promptfoo/promptfoo/tree/main/examples/google-vertex-tools) which demonstrates both basic tool declarations and callback execution patterns that work with Google AI Studio models. ### Structured Output You can constrain the model to output structured JSON responses in two ways: #### 1. Using Response Schema Configuration ```yaml providers: - id: google:gemini-2.5-pro config: generationConfig: response_mime_type: 'application/json' response_schema: type: 'object' properties: title: type: 'string' summary: type: 'string' tags: type: 'array' items: type: 'string' required: ['title', 'summary'] ``` #### 2. Using Response Schema File ```yaml providers: - id: google:gemini-2.5-pro config: # Can be inline schema or file path responseSchema: 'file://path/to/schema.json' ``` For more details, see the [Gemini API documentation](https://ai.google.dev/docs). ### Search Grounding Search grounding allows Gemini models to access the internet for up-to-date information, enhancing responses about recent events and real-time data. 
#### Basic Usage

To enable Search grounding:

```yaml
providers:
  - id: google:gemini-2.5-flash
    config:
      tools:
        - googleSearch: {} # or google_search: {}
```

#### Combining with Other Features

You can combine Search grounding with thinking capabilities for better reasoning:

```yaml
providers:
  - id: google:gemini-2.5-pro
    config:
      generationConfig:
        thinkingConfig:
          thinkingBudget: 1024
      tools:
        - googleSearch: {}
```

#### Supported Models

:::info
Search grounding works with most recent Gemini models including:

- Gemini 2.5 Flash and Pro models
- Gemini 2.0 Flash and Pro models
- Gemini 1.5 Flash and Pro models
:::

#### Use Cases

Search grounding is particularly valuable for:

- Current events and news
- Recent developments
- Stock prices and market data
- Sports results
- Technical documentation updates

#### Working with Response Metadata

When using Search grounding, the API response includes additional metadata:

- `groundingMetadata` - Contains information about search results used
- `groundingChunks` - Web sources that informed the response
- `webSearchQueries` - Queries used to retrieve information

#### Limitations and Requirements

- Search results may vary by region and time
- Results may be subject to Google Search rate limits
- Search grounding may incur additional costs beyond normal API usage
- Search will only be performed when the model determines it's necessary
- **Important**: Per Google's requirements, applications using Search grounding must display Google Search Suggestions included in the API response metadata

For more details, see the [Google AI Studio documentation on Grounding with Google Search](https://ai.google.dev/docs/gemini_api/grounding).

### Code Execution

Code execution allows Gemini models to write and execute Python code to solve computational problems, perform calculations, and generate data visualizations.

#### Basic Usage

To enable code execution:

```yaml
providers:
  - id: google:gemini-2.5-flash
    config:
      tools:
        - codeExecution: {}
```

#### Example Use Cases

Code execution is particularly valuable for:

- Mathematical computations and calculations
- Data analysis and visualization

For more details, see the [Google AI Studio documentation on Code Execution](https://ai.google.dev/gemini-api/docs/code-execution).

### URL Context

URL context allows Gemini models to extract and analyze content from web URLs, enabling them to understand and work with information from specific web pages.

#### Basic Usage

To enable URL context:

```yaml
providers:
  - id: google:gemini-2.5-flash
    config:
      tools:
        - urlContext: {}
```

#### Example Use Cases

URL context is particularly valuable for:

- Analyzing specific web page content
- Extracting information from documentation
- Comparing information across multiple URLs

For more details, see the [Google AI Studio documentation on URL Context](https://ai.google.dev/gemini-api/docs/url-context).

For complete working examples of the search grounding, code execution, and URL context features, see the [google-aistudio-tools examples](https://github.com/promptfoo/promptfoo/tree/main/examples/google-aistudio-tools).

## Google Live API

Promptfoo supports Google's WebSocket-based Live API, which enables low-latency bidirectional voice and video interactions with Gemini models. This API provides real-time interactive capabilities beyond what's available in the standard REST API.
### Using the Live Provider Access the Google Live API by specifying the model with the 'live' service type: ```yaml providers: - id: 'google:live:gemini-3.1-flash-live-preview' config: generationConfig: response_modalities: ['audio'] outputAudioTranscription: {} timeoutMs: 10000 ``` ### Key Features - **Real-time bidirectional communication**: Uses WebSockets for faster responses - **Multimodal capabilities**: Can process text, audio, and video inputs - **Built-in tools**: Supports function calling and Google Search integration - **Low-latency interactions**: Optimized for conversational applications - **Session memory**: The model retains context throughout the session ### Function Calling Example The Google Live API supports function calling, allowing you to define tools that the model can use: ```yaml providers: - id: 'google:live:gemini-3.1-flash-live-preview' config: tools: file://tools.json generationConfig: response_modalities: ['audio'] outputAudioTranscription: {} timeoutMs: 10000 ``` Where `tools.json` contains function declarations and built-in tools: ```json [ { "functionDeclarations": [ { "name": "get_weather", "description": "Get current weather information for a city", "parameters": { "type": "OBJECT", "properties": { "city": { "type": "STRING", "description": "The name of the city to get weather for" } }, "required": ["city"] } } ] }, { "googleSearch": {} } ] ``` ### Built-in Tools The current Google Live API model supports built-in Google Search: 1. **Google Search**: Perform real-time web searches ```json { "googleSearch": {} } ``` ### Audio Generation Evaluate audio generation with the Google Live provider: 1. Basic audio generation: ```yaml providers: - id: 'google:live:gemini-3.1-flash-live-preview' config: generationConfig: response_modalities: ['audio'] outputAudioTranscription: {} # Enable transcription speechConfig: voiceConfig: prebuiltVoiceConfig: voiceName: 'Charon' timeoutMs: 30000 ``` 2. Specifying additional options, such as enabling affective dialog on the older 2.5 Live model: ```yaml providers: - id: 'google:live:gemini-2.5-flash-native-audio-preview-12-2025' config: apiVersion: 'v1alpha' # Required for affective dialog generationConfig: response_modalities: ['audio'] enableAffectiveDialog: true ``` Other configuration options are available, such as setting proactive audio, setting the language code, and more. Read more about sending and receiving audio for Gemini in the [Google Live API documentation](https://ai.google.dev/gemini-api/docs/live-guide#send-receive-audio). ### Getting Started Try the examples: ```sh # Basic text-only example promptfoo init --example google-live # Function calling and tools example promptfoo init --example google-live # Audio generation example promptfoo init --example google-live-audio ``` ### Limitations - Sessions are limited to 15 minutes for audio or 2 minutes of audio and video - Token counting is not supported - Rate limits of 3 concurrent sessions per API key apply - Maximum of 4M tokens per minute For more details, see the [Google Live API documentation](https://ai.google.dev/gemini-api/docs/live). 
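## Combining Configuration Options

Most of the options covered on this page can be combined on a single provider entry. The sketch below is illustrative rather than exhaustive; the specific values (such as the system instruction text) are examples only, and not every option needs to be set together:

```yaml
providers:
  - id: google:gemini-2.5-pro
    config:
      # Core generation settings
      temperature: 0.7
      maxOutputTokens: 2048
      # System-level instructions (inline text or file://)
      systemInstruction: 'You are a concise research assistant'
      # Thinking budget (Gemini 2.5 models)
      generationConfig:
        thinkingConfig:
          thinkingBudget: 1024
      # Search grounding
      tools:
        - googleSearch: {}
      # Content filtering
      safetySettings:
        - category: HARM_CATEGORY_DANGEROUS_CONTENT
          threshold: BLOCK_ONLY_HIGH
```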
## See Also - [Vertex AI Provider](/docs/providers/vertex) - For enterprise features and advanced Google AI capabilities - [Google Examples](https://github.com/promptfoo/promptfoo/tree/main/examples) - Browse working examples for Google AI Studio - [Gemini API Documentation](https://ai.google.dev/docs) - Official Google AI documentation - [Configuration Reference](/docs/configuration/reference) - Complete configuration options for promptfoo