---
sidebar_label: Factuality
description: 'Validate factual accuracy of LLM responses using AI-powered fact-checking against verified knowledge bases and sources'
---

# Factuality

The `factuality` assertion evaluates the factual consistency between an LLM output and a reference answer. It uses a structured prompt based on [OpenAI's evals](https://github.com/openai/evals/blob/main/evals/registry/modelgraded/fact.yaml) to determine whether the output is factually consistent with the reference.

## How to use it

To use the `factuality` assertion type, add it to your test configuration like this:

```yaml
assert:
  - type: factuality
    # Specify the reference statement to check against:
    value: The Earth orbits around the Sun
```

For non-English evaluation output, see the [multilingual evaluation guide](/docs/configuration/expected-outputs/model-graded#non-english-evaluation).

## How it works

The factuality checker evaluates whether the LLM output and the reference answer (the `value`) are factually consistent. It categorizes their relationship as one of:

- **(A)** Output is a subset of the reference and is fully consistent
- **(B)** Output is a superset of the reference and is fully consistent
- **(C)** Output contains all the same details as the reference
- **(D)** Output and reference disagree
- **(E)** Output and reference differ, but the differences don't matter for factuality

By default, categories A, B, C, and E are considered passing grades, while D is considered failing.

## Example Configuration

Here's a complete example showing how to use factuality checks:

```yaml title="promptfooconfig.yaml"
prompts:
  - 'What is the capital of {{state}}?'
providers:
  - openai:gpt-5
  - anthropic:claude-sonnet-4-5-20250929
tests:
  - vars:
      state: California
    assert:
      - type: factuality
        value: Sacramento is the capital of California
  - vars:
      state: New York
    assert:
      - type: factuality
        value: Albany is the capital city of New York state
```

## Customizing Score Thresholds

You can customize which factuality categories are considered passing by setting scores in your test configuration:

```yaml
defaultTest:
  options:
    factuality:
      subset: 1 # Score for category A (default: 1)
      superset: 1 # Score for category B (default: 1)
      agree: 1 # Score for category C (default: 1)
      disagree: 0 # Score for category D (default: 0)
      differButFactual: 1 # Score for category E (default: 1)
```

## Overriding the Grader

Like other model-graded assertions, you can override the default grader:

1. Using the CLI:

   ```sh
   promptfoo eval --grader openai:gpt-5-mini
   ```

2. Using test options:

   ```yaml
   defaultTest:
     options:
       provider: anthropic:claude-sonnet-4-5-20250929
   ```

3. Using assertion-level override:

   ```yaml
   assert:
     - type: factuality
       value: Sacramento is the capital of California
       provider: openai:gpt-5-mini
   ```
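The grader override and the score thresholds shown above can be combined, since both live under `defaultTest.options`. The following is a minimal sketch (the grader model is just an example, and prompts/providers/tests are assumed to be defined elsewhere in your config) that pins the grader model and treats superset answers, category B, as failures:

```yaml
defaultTest:
  options:
    # Grade every factuality assertion with one fixed model (example model name)
    provider: openai:gpt-5-mini
    # All five category scores are spelled out; only `superset` differs from the defaults
    factuality:
      subset: 1 # Category A still passes
      superset: 0 # Category B now fails: outputs that add claims beyond the reference are rejected
      agree: 1 # Category C still passes
      disagree: 0 # Category D fails (default)
      differButFactual: 1 # Category E still passes
```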
## Customizing the Prompt

You can customize the evaluation prompt using the `rubricPrompt` property. The prompt has access to the following Nunjucks template variables:

- `{{input}}`: The original prompt/question
- `{{ideal}}`: The reference answer (from the `value` field)
- `{{completion}}`: The LLM's actual response (provided automatically by promptfoo)

Your custom prompt should instruct the model to either:

1. Return a single letter (A, B, C, D, or E) corresponding to the category, or
2. Return a JSON object with `category` and `reason` fields

Here's an example of a custom prompt:

```yaml
defaultTest:
  options:
    rubricPrompt: |
      Input: {{input}}
      Reference: {{ideal}}
      Completion: {{completion}}

      Evaluate the factual consistency between the completion and reference.
      Choose the most appropriate option:
      (A) Completion is a subset of reference
      (B) Completion is a superset of reference
      (C) Completion and reference are equivalent
      (D) Completion and reference disagree
      (E) Completion and reference differ, but differences don't affect factuality

      Answer with a single letter (A/B/C/D/E).
```

The factuality checker will parse either format:

- A single letter response like "A" or "(A)"
- A JSON object: `{"category": "A", "reason": "Detailed explanation..."}`

## Using Factuality with CSV

Use the `factuality:` prefix in `__expected` columns:

```csv title="tests.csv"
question,__expected
"What does GPT stand for?","factuality:Generative Pre-trained Transformer"
"What is photosynthesis?","factuality:Plants convert sunlight into chemical energy"
```

To apply factuality to all rows, see [CSV with defaultTest](/docs/configuration/test-cases#csv-with-defaulttest).

## See Also

- [Model-graded metrics](/docs/configuration/expected-outputs/model-graded) for more options
- [Guide on LLM factuality](/docs/guides/factuality-eval)