---
sidebar_label: n8n
title: Using Promptfoo in n8n Workflows
description: Learn how to integrate Promptfoo's LLM evaluation into your n8n workflows for automated testing, security and quality gates, and result sharing
---

# Using Promptfoo in n8n Workflows

This guide shows how to run Promptfoo evaluations from an **n8n** workflow so you can:

- schedule nightly or ad‑hoc LLM tests,
- gate downstream steps (Slack/Teams alerts, merge approvals, etc.) on pass‑rates, and
- publish rich results links generated by Promptfoo.

## Prerequisites

| What                                                                                 | Why                                                   |
| ------------------------------------------------------------------------------------ | ----------------------------------------------------- |
| **Self‑hosted n8n ≥ v1** (Docker or bare‑metal)                                      | Gives access to the “Execute Command” node.           |
| **Promptfoo CLI** available in the container/host                                    | Needed to run `promptfoo eval`.                       |
| (Optional) **LLM provider API keys** set as environment variables or n8n credentials | Example: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, …     |
| (Optional) **Slack / email / GitHub nodes** in the same workflow                     | For notifications or comments once the eval finishes. |

### Shipping a custom Docker image (recommended)

The easiest approach is to bake Promptfoo into your n8n image so every workflow run already has the CLI:

```dockerfile
# Dockerfile
# Pin a fixed tag in production instead of latest
FROM n8nio/n8n:latest

# Gain permissions to install packages
USER root
# Install the Promptfoo CLI system-wide
RUN npm install -g promptfoo
# Drop back to the non-root user
USER node
```

Update **`docker‑compose.yml`**:

```yaml
services:
  n8n:
    build: .
    env_file: .env # where your OPENAI_API_KEY lives
    volumes:
      - ./data:/data # prompts & configs live here
```

If you prefer not to rebuild the image, you _can_ install Promptfoo on the fly inside the **Execute Command** node, but that adds 10‑15 s to every execution.
## Basic “Run & Alert” workflow

Below is the minimal pattern most teams start with:

| #   | Node                          | Purpose                                                           |
| --- | ----------------------------- | ----------------------------------------------------------------- |
| 1   | **Trigger** (Cron or Webhook) | Decide _when_ to evaluate (nightly, on Git push webhook, …).      |
| 2   | **Execute Command**           | Runs Promptfoo and emits raw stdout / stderr.                     |
| 3   | **Code / Set** node           | Parses the resulting JSON, extracts pass/fail counts & share‑URL. |
| 4   | **IF** node                   | Branches on “failures > 0”.                                       |
| 5   | **Slack / Email / GitHub**    | Sends alert or PR comment when the gate fails.                    |

### Execute Command node configuration

```sh
# Send the eval's progress output to stderr so stdout contains only the JSON results
promptfoo eval \
  -c /data/promptfooconfig.yaml \
  --prompts "/data/prompts/**/*.json" \
  --output /tmp/pf-results.json \
  --share --fail-on-error 1>&2

cat /tmp/pf-results.json
```

Set the working directory to `/data` (mounted via a Docker volume) and enable “Execute Once” so the node runs once per trigger. The node writes a machine‑readable results file **and** prints it to stdout, so the next node can simply `JSON.parse($json["stdout"])`.

:::info
The **Execute Command** node that we rely on is only available in **self‑hosted** n8n. n8n Cloud does **not** expose it yet.
:::

### Sample “Parse & alert” snippet (Code node, TypeScript)

```ts
// Input: raw JSON string from previous node
const output = JSON.parse(items[0].json.stdout as string);

const { successes, failures } = output.results.stats;
items[0].json.passRate = successes / (successes + failures);
items[0].json.failures = failures;
items[0].json.shareUrl = output.shareableUrl;

return items;
```

An **IF** node can then route execution:

- **failures = 0** → take the _green_ path (maybe just archive the results).
- **failures > 0** → post to Slack or comment on the pull request.
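If you would rather gate on a pass‑rate threshold than a raw failure count, the same Code node can compute the verdict before the IF node. A minimal sketch, reusing the `successes`/`failures` stats shape from the snippet above (the `evaluateGate` helper and the 0.95 threshold are illustrative choices, not a Promptfoo API):

```typescript
// Hypothetical pass-rate gate: emits a `gatePassed` flag the IF node can branch on.
interface PromptfooStats {
  successes: number;
  failures: number;
}

function evaluateGate(stats: PromptfooStats, threshold = 0.95) {
  const total = stats.successes + stats.failures;
  const passRate = total > 0 ? stats.successes / total : 0;
  return {
    passRate,
    failures: stats.failures,
    gatePassed: passRate >= threshold,
  };
}

// Example: 19 of 20 tests passed -> pass rate 0.95, which meets the default threshold.
console.log(evaluateGate({ successes: 19, failures: 1 }));
// { passRate: 0.95, failures: 1, gatePassed: true }
```

A threshold gate is often less noisy than “failures > 0” for large suites with a few known‑flaky tests.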
## Evaluating n8n AI Agent prompts and outputs If your goal is to test the **prompt inside an n8n AI Agent / OpenAI node** (not just run Promptfoo from a workflow), treat the n8n node like any other app contract: 1. Put the agent prompt in a file, 2. Map incoming n8n fields to `tests.vars`, and 3. Assert on the exact JSON or tool-call shape that downstream n8n nodes expect. This works well when you want to regression-test an agent before wiring it into a larger workflow. ### Validate JSON that downstream n8n nodes consume If your agent is supposed to emit structured data for a **Set**, **Code**, **Switch**, or **HTTP Request** node, validate the payload directly. ```yaml title="promptfooconfig.yaml" prompts: - file://./prompts/n8n-support-router.txt providers: - openai:gpt-5-mini tests: - vars: customer_message: 'Customer wants to cancel order #4815 and asks for a refund' assert: - type: contains-json value: type: object required: [route, priority, reply] properties: route: type: string enum: [billing, support, sales] priority: type: string enum: [low, medium, high] reply: type: string ``` Use `contains-json` when the model may wrap JSON in prose or a markdown code block. If your node must return **only** JSON, use [`is-json`](/docs/guides/evaluate-json) instead. ### Validate tool calls for agent workflows If your n8n setup uses an OpenAI-compatible agent that should call tools before continuing, validate that Promptfoo sees a real tool call and that it matches your schema. ```yaml title="promptfooconfig.yaml" prompts: - file://./prompts/n8n-calendar-agent.txt providers: - id: openai:gpt-5-mini config: tools: file://./tools/calendar-tools.yaml tests: - vars: user_request: "Move tomorrow's standup to 3pm and notify the team" assert: - type: finish-reason value: tool_calls - type: is-valid-openai-tools-call ``` That pattern is especially useful when your n8n workflow branches on whether the LLM produced a tool invocation versus a final answer. 
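On the n8n side, the `contains-json` pattern above has a mirror image: the downstream **Code** node usually has to pull the JSON out of a reply that may wrap it in prose or a markdown fence. A minimal sketch, assuming the `route`/`priority`/`reply` schema from the config above (the `extractJson` helper and its outermost‑brace heuristic are illustrative, not part of Promptfoo or n8n):

```typescript
// Hypothetical helper for a downstream Code node: pull a JSON object out of a
// reply that may surround it with prose. The outermost-brace heuristic is an
// assumption that works when the reply contains exactly one JSON object.
function extractJson(text: string): Record<string, unknown> | null {
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(text.slice(start, end + 1));
  } catch {
    return null; // braces found, but not valid JSON
  }
}

const reply =
  'Sure! Here is the routing decision: {"route": "billing", "priority": "high", "reply": "Refund initiated."}';
console.log(extractJson(reply)?.route); // prints: billing
console.log(extractJson("no structured data here")); // prints: null
```

If `extractJson` returns `null`, route the item down an error branch instead of letting `JSON.parse` throw mid‑workflow.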
### Useful building blocks

- [`/docs/configuration/tools`](/docs/configuration/tools) for defining tool schemas
- [`/docs/guides/evaluate-json`](/docs/guides/evaluate-json) for JSON and schema assertions
- [`examples/openai-tools-call`](https://github.com/promptfoo/promptfoo/tree/main/examples/openai-tools-call) for a concrete OpenAI tool-calling config
- [`examples/eval-tool-use`](https://github.com/promptfoo/promptfoo/tree/main/examples/eval-tool-use) for finish-reason and tool-use checks across providers

## Advanced patterns

### Run different configs in parallel

Make the first **Execute Command** node loop over an array of model IDs or config files and push each run as a separate item. Downstream nodes will automatically fan‑out and handle each result independently.

### Version‑controlled prompts

Mount your prompts directory and config file into the container at `/data`. When you commit new prompts to Git, your CI/CD system can call the **n8n REST API** or a **Webhook trigger** to re‑evaluate immediately.

### Auto‑fail the whole workflow

If you run n8n headless, you can call this workflow from CI pipelines (GitHub Actions, GitLab, …) with the [CLI `n8n execute` command](https://docs.n8n.io/hosting/cli-commands/) and check its exit code; returning `exit 1` from the Execute Command node will propagate the failure.

## Security & best practices

- **Keep API keys secret** – store them in the n8n credential store or inject as environment variables from Docker secrets, not hard‑coded in workflows.
- **Resource usage** – Promptfoo supports caching via `PROMPTFOO_CACHE_PATH`; mount that directory to persist across runs.
- **Timeouts** – wrap `promptfoo eval` with `timeout --signal=SIGKILL 15m …` (Linux) if you need hard execution limits.
- **Logging** – route the `stderr` field of Execute Command to a dedicated log channel so you don’t miss stack traces.
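The “run different configs in parallel” pattern above can be sketched as a Code node placed before the Execute Command node: emit one item per config path, and n8n runs the command once per item. A minimal sketch (the config paths are placeholders):

```typescript
// Hypothetical fan-out: one n8n item per Promptfoo config file, consumed by a
// downstream Execute Command node that references {{ $json.configPath }}.
const configs = [
  "/data/configs/model-a.yaml",
  "/data/configs/model-b.yaml",
  "/data/configs/model-c.yaml",
];

// n8n Code nodes return an array of { json: ... } items; downstream nodes
// execute once per item, which gives you the parallel fan-out for free.
const fanOut = (paths: string[]) => paths.map((configPath) => ({ json: { configPath } }));

console.log(fanOut(configs).length); // prints: 3
```

Each branch then parses its own results file, so one slow or failing model doesn’t block the others.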
## Troubleshooting | Symptom | Likely cause / fix | | ------------------------------------------- | ----------------------------------------------------------------------------------------------------- | | **`Execute Command node not available`** | You’re on n8n Cloud; switch to self‑hosted. | | **`promptfoo: command not found`** | Promptfoo not installed inside the container. Rebuild your Docker image or add an install step. | | **Run fails with `ENOENT` on config paths** | Make sure the prompts/config volume is mounted at the same path you reference in the command. | | **Large evals time‑out** | Increase the node’s “Timeout (s)” setting _or_ chunk your test cases and iterate inside the workflow. | ## Next steps 1. Combine Promptfoo with the **n8n AI Transform** node to chain evaluations into multi‑step RAG pipelines. 2. Use **n8n Insights** (self‑hosted EE) to monitor historical pass‑rates and surface regressions. 3. Check out the other [CI integrations](/docs/integrations/ci-cd) ([GitHub Actions](/docs/integrations/github-action), [CircleCI](/docs/integrations/circle-ci), etc) for inspiration. Happy automating!