# xai/chat (xAI Grok Models Evaluation)

This example demonstrates how to evaluate xAI's Grok models across their main capabilities: text generation with reasoning, image creation, and server-side search tools.

You can run this example with:

```bash
npx promptfoo@latest init --example xai/chat
cd xai/chat
```

## Environment Variables

This example requires the following environment variable:

- `XAI_API_KEY` - Your xAI API key. You can obtain this from the [xAI Console](https://console.x.ai/)

## Quick Start

```bash
# Set your API key
export XAI_API_KEY=your_api_key_here

# Run the main evaluation
promptfoo eval

# View results in the web interface
promptfoo view
```

## What's Tested

This example includes configurations to test different Grok capabilities:

- **Text Generation** (`promptfooconfig.yaml`) - Mathematical reasoning with Grok 4.3, Grok 4.20, Grok 4.1 Fast, Grok 4 Fast, Grok 4, and Grok 3 models
- **Image Generation** (`promptfooconfig.images.yaml`) - Artistic image creation using Grok's image models
- **Search Tools** (`promptfooconfig.search.yaml`) - Real-time web and X search using the Responses API
- **Agent Tools (Responses API)** (`promptfooconfig.responses.yaml`) - Autonomous web and X search using Agent Tools
- **Search Demo** (`promptfooconfig.promptfoo-search.yaml`) - Responses API search with assertions example

## Run Individual Tests

```bash
# Text generation with mathematical reasoning
promptfoo eval -c promptfooconfig.yaml

# Image generation with artistic styles
promptfoo eval -c promptfooconfig.images.yaml

# Search tools with web and X sources
promptfoo eval -c promptfooconfig.search.yaml

# Agent Tools with Responses API (recommended)
promptfoo eval -c promptfooconfig.responses.yaml

# Search demo with assertions
promptfoo eval -c promptfooconfig.promptfoo-search.yaml
```

## Featured Models

### Grok 4.3

The recommended starting point for general text workflows:

- `xai:grok-4.3` - General-purpose reasoning model
- `xai:responses:grok-4.3` - Recommended form for server-side tools

### Grok 4.20

- `xai:grok-4.20-reasoning` - Reasoning model
- `xai:grok-4.20-non-reasoning` - Non-reasoning model
- `xai:grok-4.20-multi-agent` - Multi-agent variant

### Grok 4.1 Fast

A frontier model optimized for agentic tool calling with a 2M context window:

- `xai:grok-4-1-fast-reasoning` - Maximum intelligence with reasoning
- `xai:grok-4-1-fast-non-reasoning` - Fast responses without reasoning

### Grok 4 Fast

Fast reasoning models with 2M context:

- `xai:grok-4-fast-reasoning` - Reasoning variant
- `xai:grok-4-fast-non-reasoning` - Non-reasoning variant

### Grok 4

Flagship reasoning model:

- `xai:grok-4` - Full reasoning capabilities

### Agent Tools (Responses API)

Enable autonomous tool execution via the Responses API:

```yaml
providers:
  - id: xai:responses:grok-4.3
    config:
      tools:
        - type: web_search
        - type: x_search
        - type: code_interpreter
```

### Search Tools

Enable real-time search via the Responses API:

```yaml
providers:
  - id: xai:responses:grok-4.3
    config:
      tools:
        - type: web_search
        - type: x_search
```

## Expected Results

- **Text Generation**: Grok will provide step-by-step mathematical solutions with clear reasoning
- **Image Generation**: Generated images in the requested artistic styles
- **Search Tools**: Current information from web and X with source citations
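
The text-generation expectation can also be checked automatically with an assertion. The following is a minimal sketch, not the contents of the shipped `promptfooconfig.yaml`: the description, prompt wording, the `problem` variable, and the test case are all hypothetical and chosen only for illustration.

```yaml
# Illustrative sketch only; prompt, variable name, and test case are assumptions.
description: Grok math reasoning check (hypothetical)

prompts:
  - 'Solve the following problem step by step: {{problem}}'

providers:
  - id: xai:grok-4.3

tests:
  - vars:
      problem: 'What is 17 * 23?'
    assert:
      # Passes only if the final answer appears somewhere in the output
      - type: contains
        value: '391'
```

Saved under a file name of your choice, a config like this could be run with `promptfoo eval -c <your-file>.yaml`. A plain `contains` check only confirms that the answer appears in the output; it says nothing about the quality of the reasoning steps.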