---
sidebar_label: llamafile
description: 'Deploy LLMs as portable single-file executables using llamafile for offline testing with OpenAI-compatible API endpoints'
---

# llamafile

Llamafile has an [OpenAI-compatible HTTP endpoint](https://github.com/Mozilla-Ocho/llamafile?tab=readme-ov-file#json-api-quickstart), so you can override the [OpenAI provider](/docs/providers/openai/) to talk to your llamafile server.

To use llamafile in your eval, set the `apiBaseUrl` config property to `http://localhost:8080/v1` (or wherever you're hosting llamafile).

Here's an example config that uses LLaMA_CPP for chat completions:

```yaml
providers:
  - id: openai:chat:LLaMA_CPP
    config:
      apiBaseUrl: http://localhost:8080/v1
```

If desired, you can use the `OPENAI_BASE_URL` environment variable instead of the `apiBaseUrl` config.
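For example, assuming you run evals with `npx promptfoo@latest eval` (adjust to however you invoke the CLI), the override for a single run might look like:

```sh
# Point the OpenAI provider at the local llamafile server for this run only
OPENAI_BASE_URL=http://localhost:8080/v1 npx promptfoo@latest eval
```

Putting it together, a minimal `promptfooconfig.yaml` sketch could look like the following. The prompt text, test variable, and assertion are illustrative placeholders; only the `providers` block is llamafile-specific:

```yaml
# Illustrative eval sketch -- only the providers block is llamafile-specific
prompts:
  - 'Summarize the following text in one sentence: {{text}}'

providers:
  - id: openai:chat:LLaMA_CPP
    config:
      apiBaseUrl: http://localhost:8080/v1

tests:
  - vars:
      text: 'llamafile packages a model and an inference server into a single executable.'
    assert:
      - type: icontains
        value: llamafile
```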