---
sidebar_position: 50
description: 'Measure LLM faithfulness to source context by detecting unsupported claims in responses.'
---

# Context faithfulness

Checks if the LLM's response only makes claims that are supported by the provided context.

**Use when**: You need to ensure the LLM isn't adding information beyond what was retrieved.

**How it works**: Extracts factual claims from the response, then verifies each against the context. Score = supported claims / total claims.

**Example**:

```text
Context: "Paris is the capital of France."
Response: "Paris, with 2.2 million residents, is France's capital."
Score: 0.5 (capital ✓, population ✗)
```

## Configuration

```yaml
assert:
  - type: context-faithfulness
    threshold: 0.9 # Require 90% of claims to be supported
```

### Required fields

- `query` - User's question (in test vars)
- `context` - Reference text (in vars or via `contextTransform`)
- `threshold` - Minimum score 0-1 (default: 0)

### Full example

```yaml
tests:
  - vars:
      query: 'What is the capital of France?'
      context: 'Paris is the capital and largest city of France.'
    assert:
      - type: context-faithfulness
        threshold: 0.9
```

### Array context

Context can also be an array:

```yaml
tests:
  - vars:
      query: 'Tell me about France'
      context:
        - 'Paris is the capital and largest city of France.'
        - 'France is located in Western Europe.'
        - 'The country has a rich cultural heritage.'
    assert:
      - type: context-faithfulness
        threshold: 0.8
```

### Dynamic context extraction

For RAG systems that return context with their response:

```yaml
# Provider returns { answer: "...", context: "..." }
assert:
  - type: context-faithfulness
    contextTransform: 'output.context' # Extract context field
    threshold: 0.9
```

### Custom grading

Override the default grader:

```yaml
assert:
  - type: context-faithfulness
    provider: gpt-5 # Use a different model for grading
    threshold: 0.9
```

## Limitations

- Depends on judge LLM quality
- May miss implicit claims
- Performance degrades with very long contexts

## Related metrics

- [`context-relevance`](/docs/configuration/expected-outputs/model-graded/context-relevance) - Is retrieved context relevant?
- [`context-recall`](/docs/configuration/expected-outputs/model-graded/context-recall) - Does context support the expected answer?

## Further reading

- [Defining context in test cases](/docs/configuration/expected-outputs/model-graded#defining-context)
- [RAG Evaluation Guide](/docs/guides/evaluate-rag)
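
As a rough sketch of how the configuration pieces on this page might fit together for a RAG provider: the test supplies `query` through `vars`, while the context is taken from the provider's output with `contextTransform` instead of a `context` var. It assumes, as in the dynamic context extraction snippet, that the provider returns an object with a `context` field; the question text is only illustrative.

```yaml
tests:
  - vars:
      # The user's question; context is not set here because it is
      # extracted from the provider output below.
      query: 'What is the capital of France?'
    assert:
      - type: context-faithfulness
        # Assumes the provider returns { answer: "...", context: "..." }
        contextTransform: 'output.context'
        threshold: 0.9 # Require 90% of claims to be supported
```

With this layout, a response that adds a claim absent from the returned context (such as the population figure in the example at the top of the page) would lower the score and could fail the threshold.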