# provider-replicate/llama4-scout (Replicate Llama 4 Scout)

You can run this example with:

```bash
npx promptfoo@latest init --example provider-replicate/llama4-scout
cd provider-replicate/llama4-scout
```

This example shows how to use Replicate to run **Llama 4 Scout**, a mixture-of-experts model with 17 billion active parameters and 16 experts.

## About Llama 4 Scout

[Llama 4 Scout](https://replicate.com/meta/llama-4-scout-instruct) is part of the Llama 4 collection of natively multimodal AI models. Key features:

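- **17 billion active parameters** across **16 experts** (109 billion parameters in total)
- **Mixture-of-experts architecture** - only a subset of the parameters is used for each token, which lowers inference cost relative to a dense model of the same total size
- **Natively multimodal** - designed for text and image input
- **Strong text and image understanding** - as reported by Meta for models in this size class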

## Environment Variables

This example requires the following environment variable:

- `REPLICATE_API_TOKEN` - Your Replicate API token (get one at https://replicate.com/account/api-tokens)

You can set this in a `.env` file or directly in your environment:

```bash
export REPLICATE_API_TOKEN=your_api_token_here
```

## What This Example Does

This example:

- Tests the Llama 4 Scout model on various analytical and creative tasks
- Probes the model's reasoning on technical and open-ended prompts
- Compares Llama 4 Scout with Llama 3 on the same prompts
- Shows how to set Replicate model parameters such as `temperature`, `max_tokens`, and `top_p`

## Running the Example

1. Set your Replicate API token (see above)
2. Run the evaluation:

```bash
promptfoo eval
```

3. View the results:

```bash
promptfoo view
```

## Model Configuration

The example demonstrates key Replicate configuration options for Llama 4:

- `temperature`: Controls sampling randomness (values near 0.0 are close to deterministic; higher values such as 1.0 produce more varied output)
- `max_tokens`: Maximum number of tokens to generate
- `top_p`: Nucleus sampling threshold for token selection
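
As a rough sketch, these options sit under a provider's `config` block in `promptfooconfig.yaml`. The values below are illustrative assumptions rather than the settings shipped with this example, and the Llama 3 model ID used for the comparison is an assumed choice; check the example's own config file for the actual ones.

```yaml
# Illustrative fragment only; values are assumptions, not the shipped config.
providers:
  - id: replicate:meta/llama-4-scout-instruct
    label: llama-4-scout
    config:
      temperature: 0.7 # moderate randomness
      max_tokens: 1024 # cap on generated tokens
      top_p: 0.9 # nucleus sampling threshold
  # Comparison model: an assumed Llama 3 variant; swap in the one you want to test.
  - id: replicate:meta/meta-llama-3-8b-instruct
    label: llama-3-8b
    config:
      temperature: 0.7
      max_tokens: 1024
      top_p: 0.9
```

Giving both providers the same sampling settings keeps the comparison focused on the models rather than on the parameters.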

## Test Cases

The example includes tests for:

- **AI and mixture-of-experts architecture** - Checks whether the model can explain the kind of architecture it is built on
- **Multimodal AI** - Explores how well the model describes multimodal capabilities
- **Quantum computing** - Complex technical topics
- **Climate solutions** - Practical problem-solving
- **Creative writing** - Narrative and storytelling abilities
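
A test case in promptfoo pairs variables with assertions on the output. The fragment below is a hypothetical sketch of how two of these topics could be expressed; the prompt wording, variable name, and assertion values are assumptions and will differ from the tests that ship with the example.

```yaml
# Hypothetical prompt and tests; the topics mirror the list above.
prompts:
  - 'Explain {{topic}} to a technical reader in one short paragraph.'
tests:
  - vars:
      topic: how a mixture-of-experts model routes tokens
    assert:
      # Case-insensitive substring check on the model output
      - type: icontains
        value: expert
  - vars:
      topic: quantum computing
    assert:
      # Model-graded check; requires a grading provider to be configured
      - type: llm-rubric
        value: Accurately describes qubits and superposition
```

Substring checks are cheap and deterministic, while `llm-rubric` is better suited to open-ended answers at the cost of an extra model call per test.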

## Customizing the Example

You can modify this example to:

- Test Llama 4 Maverick (128 experts) when available
- Add image understanding tests (when multimodal features are enabled)
- Compare against other state-of-the-art models
- Explore the mixture-of-experts architecture's impact on different tasks

## Notes

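- Llama 4 Scout routes each token through a subset of its 16 experts, so only about 17 billion parameters are active per token
- The test cases cover both analytical and creative tasks
- Whether the 16-expert design improves response quality on a given task is something the comparison with Llama 3 can help you judge
- Scout is part of the Llama 4 family of natively multimodal models; this example uses text prompts only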