---
sidebar_label: Llama.cpp
description: "Execute quantized LLMs efficiently on CPUs using llama.cpp's optimized inference engine for resource-constrained deployments"
---

# Llama.cpp

The `llama` provider is compatible with the HTTP server bundled with [llama.cpp](https://github.com/ggerganov/llama.cpp). This lets you evaluate models served by `llama.cpp` from within Promptfoo.

## Configuration

To use the `llama` provider, specify `llama` as the provider in your `promptfooconfig.yaml` file.

Supported environment variables:

- `LLAMA_BASE_URL` - Scheme, hostname, and port of the llama.cpp server (defaults to `http://localhost:8080`)

For a detailed example of how to use Promptfoo with `llama.cpp`, including configuration and setup, refer to the [example on GitHub](https://github.com/promptfoo/promptfoo/tree/main/examples/provider-llama-cpp).
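
As a quick starting point, here is a minimal sketch of a `promptfooconfig.yaml` that points Promptfoo at a local llama.cpp server; the prompt and test case below are illustrative placeholders, not part of the provider itself:

```yaml
# Minimal sketch: evaluate one prompt against a llama.cpp HTTP server.
# The provider targets http://localhost:8080 by default; set the
# LLAMA_BASE_URL environment variable to reach a server elsewhere.
providers:
  - llama

prompts:
  - 'Answer concisely: {{question}}'

tests:
  - vars:
      question: 'What is the capital of France?'
```

With the llama.cpp server running, `promptfoo eval` in the same directory sends each test case to the server and reports the results.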