# claude-thinking (Claude Thinking) This example demonstrates Claude's "thinking" capability, which allows you to see the model's step-by-step reasoning process before it provides a final answer. The example compares thinking outputs from Claude Sonnet 4 (Anthropic API) and Claude Haiku 4.5 (AWS Bedrock). You can run this example with: ```bash npx promptfoo@latest init --example claude-thinking cd claude-thinking ``` ## What This Example Demonstrates - Using Claude's thinking feature to reveal step-by-step reasoning - Comparing thinking output quality between different Claude models - Comparing Anthropic API vs AWS Bedrock providers - Configuring the thinking token budget - Using LLM-based evaluation rubrics to assess reasoning quality ## Environment Variables This example requires: ### For Anthropic API - `ANTHROPIC_API_KEY` - Your Anthropic API key from [console.anthropic.com](https://console.anthropic.com/) ### For AWS Bedrock - `AWS_ACCESS_KEY_ID` - Your AWS access key - `AWS_SECRET_ACCESS_KEY` - Your AWS secret key - Or configure credentials via the AWS CLI: `aws configure` ## Running the Example After setting up environment variables: ```bash # From the example directory promptfoo eval promptfoo view ``` ## Test Cases This example includes several test cases of increasing complexity: 1. **8 Balls Problem** - A classic logic puzzle requiring careful reasoning 2. **Train Meeting Problem** - A traditional algebra word problem These test cases are specifically designed to showcase Claude's ability to break down complex problems and show detailed thinking steps. ## How Claude Thinking Works The thinking feature is enabled by setting special parameters in the provider configuration: ```yaml thinking: type: 'enabled' budget_tokens: 4096 # Controls how many tokens are allocated for thinking max_tokens: 8192 # Must be greater than budget_tokens ``` When enabled, Claude's response will include a "Thinking:" section that shows its reasoning process before the final answer: ```text Thinking: Let me solve this step by step... 1. First, I'll divide the 8 balls into three groups... 2. In the first weighing, I'll compare groups A and B... 3. Based on the result, I can determine... Final answer: We need exactly 2 weighings to find the heavier ball. ``` ## Additional Resources - [Claude Thinking Documentation](https://docs.anthropic.com/claude/docs/extended-thinking) - [AWS Bedrock Claude Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-claude.html) - [Promptfoo Documentation on Claude Providers](https://promptfoo.dev/docs/providers/anthropic)