--- sidebar_label: Configuring the Grader description: Red team LLM grading systems by validating output classification to prevent false negatives and ensure accurate vulnerability detection in AI security assessments --- # About the Grader When you run a red team scan against a target, Promptfoo will evaluate the results of the output and determine whether the result passes or fails. These results are determined by a model, which is `gpt-5` by default. When the model grades the results of the output, it determines a pass or fail score for the output based on the application context you provide in the target set up. A **pass** score means that the output did not violate your application's intended behavior and returned an output that conforms with your requirements. A **fail** score means that the output deviated from your application's intended behavior. Pass and fail scores are separate from **errors**, where the output could not be parsed. The grader is also separate from the [vulnerabilities results](/docs/enterprise/findings/), which determines the severity of findings and details about remediations. ## Configuring the Grader Configuring your grader starts when you create a new target within Promptfoo and outline details about the application in the "Usage Details" section. The `purpose` that you provide in the target setup, as well as any additional context about external system access if applicable, informs the grader. The more information you provide, the better the red team attacks will be. For custom `llm-rubric` assertions, see [Pass vs. Score Semantics](/docs/configuration/expected-outputs/model-graded/llm-rubric#pass-vs-score-semantics) if you encounter unexpected PASS results. The purpose can include details about how the LLM application should behave, including specifications like: - Tone, personality, and helpfulness - When the application should refuse to help and what it should state - What the application is intended to say and what behavior it should never return in the output Here is an example of a purpose that informs how grading occurs: ```yaml The user is an employee at the company. The target system is a chatbot that provides access to company wide information. The chatbot should maintain a cordial, professional tone and provide courteous, concise answers that are to the point. There are multiple types of users (basic, HR, executive) with different access levels. This user is a basic employee with access to: - HR policies like expenses, vacation days, benefits and the company handbook - Company history - General information about the company and its products The user should not have access to: - Any confidential documents - Information about other employees - Sensitive information about the company like upcoming acquisitions or strategic plans ``` ## Overriding the Grader You can override the grader model within your `promptfooconfig.yaml` file through modifying the `defaultTest`: ```yaml defaultTest: options: provider: 'ollama:chat:llama3.3:70b' ``` In this example, we can override the default grader to use Azure OpenAI: ```yaml defaultTest: options: provider: id: azureopenai:chat:gpt-4-deployment-name config: apiHost: 'xxxxxxx.openai.azure.com' ``` ### Using Local Providers for Grading The `redteam.provider` configuration controls both attack generation and grading. When you configure a local provider (like Ollama), promptfoo uses it for generating attacks and evaluating results: ```yaml redteam: provider: ollama:chat:llama3.2 plugins: - harmful:hate - excessive-agency ``` This configuration: - Generates adversarial inputs using `ollama:chat:llama3.2` - Grades results with the same provider - Runs entirely locally when combined with `PROMPTFOO_DISABLE_REMOTE_GENERATION=true` :::tip Fully Local Testing To run redteam tests without any remote API calls: 1. Configure a local provider: `redteam.provider: ollama:chat:llama3.2` 2. Disable remote generation: `PROMPTFOO_DISABLE_REMOTE_GENERATION=true` Both attack generation and grading will use your local model. ::: **Balancing quality and cost:** Remote generation produces significantly better attacks than local models, while grading works well locally. To reduce API costs without sacrificing attack quality, configure `redteam.provider` for local grading but leave `PROMPTFOO_DISABLE_REMOTE_GENERATION` unset (default). You can customize the grader at the plugin level to provide additional granularity into your results. ### Customizing Graders for Specific Plugins in Promptfoo Enterprise Within Promptfoo Enterprise, you can customize the grader at the plugin level. Provide an example output that you would consider a pass or fail, then elaborate on the reason why. Including more concrete examples gives additional context to the LLM grader, improving the efficacy of grading.