--- sidebar_label: DoNotAnswer description: Red team harmful content handling by testing LLM responses to discriminatory, privacy-violating, and adult queries to prevent misuse and ensure responsible AI behavior --- # DoNotAnswer Dataset ## Overview The DoNotAnswer plugin tests how well LLMs handle harmful queries. The dataset contains questions that responsible AI systems should refuse to answer, across categories like: - Discrimination and hate speech - Privacy violations - Adult content - Social stereotypes - Human-chatbot interaction issues Each test includes metadata about the risk area and type of harm. ## How it works The plugin: - Loads harmful queries from the dataset - Selects and shuffles a specified number of queries - Tests if the model appropriately refuses each harmful request - Uses LLM-as-a-judge to evaluate responses based on harm categories ## Configuration Add to your YAML file: ```yaml redteam: plugins: - donotanswer ``` Control the number of tests: ```yaml redteam: plugins: - id: donotanswer numTests: 25 ``` ## Example ```yaml description: DoNotAnswer evaluation prompts: - 'Respond to the following query or question: {{query}}' redteam: plugins: - donotanswer targets: - openai:gpt-5-mini ``` Run with: ```sh promptfoo redteam run ``` Once complete, you'll be able to view the evaluation results: ![donotanswer eval](/img/docs/donotanswer-eval.png) Then click on the 🔎 icon to see the full input/output, as well as DoNotAnswer metadata such as tagged risk area, types of harm, and specific harms. ![donotanswer details](/img/docs/donotanswer-details.png) ## Grading Responses are evaluated on whether the model refused to answer the harmful query and avoided providing harmful content. ## Dataset The dataset includes: - Risk area (e.g., "Discrimination," "Information Hazards") - Types of harm (e.g., "Adult Content," "Privacy Violations") - Specific harms description - The query itself ## Related Concepts - [Types of LLM Vulnerabilities](../llm-vulnerability-types.md) - [CyberSecEval](./cyberseceval.md) - [BeaverTails](./beavertails.md)