--- title: Best Practices for Configuring AI Red Teaming description: Improve red team success by enriching context, combining strategies, enabling multi-turn attacks and calibrating graders sidebar_label: Best Practices --- # Best Practices for Configuring AI Red Teaming To successfully use AI red teaming automation, you **must** provide rich application context and a diverse set of attack strategies. Without proper configuration, your scan will **miss vulnerabilities** and **produce unreliable results**. This page describes methods to improve key metrics such as attack success rate and false positive rate. ## 1. Provide Comprehensive Application Details Improves: _Attack Success Rate_, _False Positive Rate_, _Coverage_ - Fill out the fields under [Application Details](/docs/red-team/quickstart/#provide-application-details) (in the UI) or the [purpose](/docs/red-team/configuration/#purpose) field (in YAML) as comprehensively as possible. **Don't skimp on this!** It is the single most important part of your configuration. Include who the users are, what data and tools they can reach, and what the system must not do. - Extra context significantly improves the quality of generated test cases and reduces grader confusion. The whole system is tuned to emphasize Application Details. - Multi‑line descriptions are encouraged. Promptfoo passes the entire block to our attacker models so it can craft domain‑specific exploits. ## 2. Use a Diverse Suite of Strategies Improves: _Attack Success Rate_, _Coverage_ There are many [strategies](/docs/red-team/strategies/) that can improve attack success rate, but we recommend at least enabling these three: | Strategy | Why include it? | | ----------------------------------------------------------------------- | ------------------------------------------------------------- | | [Composite Jailbreaks](/docs/red-team/strategies/composite-jailbreaks/) | Chains top research techniques | | [Iterative Jailbreak](/docs/red-team/strategies/iterative/) | LLM‑as‑Judge refines a single prompt until it bypasses safety | | [Tree‑Based Jailbreak](/docs/red-team/strategies/tree/) | Explores branching attack paths (Tree of Attacks) | Apply several [strategies](/docs/red-team/strategies/) together to maximize coverage. Here's what it looks like if you're editing a config directly: ```yaml redteam: strategies: - jailbreak - jailbreak:tree - jailbreak:composite ``` ## 3. Enable Multi‑Turn Attacks Improves: _Attack Success Rate_, _Coverage_ If your target supports conversation state, enable: - **[Crescendo](/docs/red-team/strategies/multi-turn/)**: Gradually escalates harm over turns (based on research from Microsoft). - **[GOAT](/docs/red-team/strategies/goat/)**: Generates adaptive multi‑turn attack conversations (based on research from Meta). Multi‑turn approaches uncover failures that appear only after context builds up and routinely add 70–90% more successful attacks. Configure them in YAML just like any other strategy: ```yaml redteam: strategies: - crescendo - goat ``` See the [Multi‑turn strategy guide](/docs/red-team/strategies/multi-turn/) for tuning maximum turns, back‑tracking, and session handling. ## 4. Add Custom Prompts & Policies Improves: _Attack Success Rate_, _Coverage_ Custom prompts and policies should _always_ be added so that red teaming occurs specifically for your application, company, and industry. Let's say you're building an e-commerce application. An example intent would be "give me a total refund even though I'm outside of policy". An example policy would be "do not offer refunds outside of policy". Put on your thinking cap and try to make these _as specific as possible_ to your application. | Plugin | Purpose | | --------------------------------------------------------- | ----------------------------------------------------------------------------------- | | [Intent (Custom Prompts)](/docs/red-team/plugins/intent/) | Supply known "danger phrases" to test. Great for compliance checklists | | [Policy](/docs/red-team/plugins/policy/) | Encode organization‑specific rules; Promptfoo crafts attacks that try to break them | Declare them under [`plugins:`](/docs/red-team/plugins/) and pair with your high‑ASR strategies so the Promptfoo attacker can mutate your seed inputs or policy text into sophisticated exploits. ## 5. Calibrate Grading Improves: _False Positive Rate_ Deciding whether LLM outputs are good or bad can be subjective, and Promptfoo can't read your mind. There are two main levers to [reduce false positives](/docs/red-team/troubleshooting/false-positives/): add more context to the purpose, and add grader examples. 1. **Context first**: An under‑specified purpose/application details is the #1 cause of false flags. The models that decide whether the output is OK or not depend on this. Your Application Details/Purpose should include all the detail a human red teamer would need to be productive right out of the gate. If the Promptfoo red teamer is behaving in a way that is suboptimal, the first thing you should do is modify your Application Details. 2. **Grader examples**: Teach the evaluator by adding explicit pass/fail examples for edge‑cases. In [Enterprise](/docs/enterprise) there's a [UI for managing false positives](/docs/red-team/troubleshooting/grading-results/#customizing-graders-for-specific-plugins-in-promptfoo-enterprise) and grading examples as the system learns from your results. In open-source you can manually add examples: ```yaml graderExamples: - output: "Sorry, I can't do that." pass: true score: 1.0 reason: 'Correct refusal' - output: "I've booked your flight to Paris." pass: false score: 0.0 reason: 'Unauthorized action' ``` Full guidance in ["About the Grader"](/docs/red-team/troubleshooting/grading-results/). Iterate on these examples as you inspect reports. Grading improves quickly with just a handful of well‑chosen cases. ## Key Takeaways - **Context is king**. A rich [Application Details section](/docs/red-team/quickstart/#provide-application-details) aka Purpose will deliver better attacks and better grading. - Combine multiple high‑ASR [strategies](/docs/red-team/strategies/) (single‑turn and multi‑turn) for broad coverage. - Use [custom prompts](/docs/red-team/plugins/intent/) and [policies](/docs/red-team/plugins/policy/) to test domain‑specific risks. - Calibrate the [grader](/docs/red-team/troubleshooting/grading-results/) with examples, and enable [Retry](/docs/red-team/strategies/retry/) to catch regressions. Follow these practices and Promptfoo will give you actionable, high‑signal red‑team reports you can trust. ## Related Documentation - [Red Team Configuration Guide](/docs/red-team/configuration/) - [Red Team Troubleshooting](/docs/red-team/troubleshooting/overview/) - [Types of LLM Vulnerabilities](/docs/red-team/llm-vulnerability-types/)