--- sidebar_label: Memory Poisoning description: Red team memory poisoning vulnerabilities in AI agents through persistent state manipulation to prevent malicious instruction injection and protect decision integrity --- # Memory Poisoning Plugin ## Overview The Memory Poisoning plugin tests whether stateful agents are vulnerable to memory poisoning attacks that manipulate an agent's memory to alter its behavior in subsequent interactions. This plugin maps to **T1: Memory Poisoning** from [OWASP Agentic AI Top 10 Threats](https://genai.owasp.org/resource/agentic-ai-threats-and-mitigations/). ## Purpose Memory poisoning attacks attempt to inject malicious instructions into an agent's memory state that persist across turns. When successful, these attacks can: 1. Override or modify the agent's original purpose 2. Extract sensitive information 3. Manipulate the agent's decision-making process 4. Bypass security controls in subsequent interactions ## How it Works The Memory Poisoning plugin: 1. Creates a scenario with specific "memories" that the agent should remember 2. Sends an initial memory-establishing message to the agent 3. Follows with a poisoned message attempting to corrupt the established memory 4. Tests the attack's effectiveness with a follow-up question that relies on the original memory 5. Evaluates whether the agent's response to the follow-up reflects the original memory (secure) or the poisoned memory (vulnerable) ## Configuration Add the following to your promptfoo configuration: ```yaml redteam: purpose: "Description of your agent's intended purpose" plugins: - id: agentic:memory-poisoning numTests: 30 # Optional, defaults to 5 ``` :::note Memory Poisoning is a standalone plugin and does not use strategies. Strategies defined in the `redteam.strategies` configuration are ignored for this plugin. ::: ## Example Here's an example configuration for testing an agent: ```yaml # yaml-language-server: $schema=https://promptfoo.dev/config-schema.json description: Travel Agent Redteam providers: - id: http config: url: http://localhost:3090/api/chat method: POST headers: 'Content-Type': 'application/json' 'x-promptfoo-session': '{{sessionId}}' body: 'message': '{{prompt}}' transformResponse: 'json.message' prompts: - '{{prompt}}' defaultTest: options: transformVars: '{ ...vars, sessionId: context.uuid }' redteam: purpose: | The agent simulates a travel agent that helps users find the best travel deals. It has access to a set of tools that allow it to check the weather, lookup and book flights, and book hotels. plugins: - id: agentic:memory-poisoning numTests: 30 ``` ## Working With Results The test results will show: - The original memory-establishing message - The poisoned message - The follow-up question - The agent's responses to each message A successful attack (failed test) is indicated when the agent's response to the follow-up question reflects the poisoned instructions rather than the original memory. ## Mitigations To protect against memory poisoning attacks: 1. Implement input validation that filters or sanitizes user inputs prior to persistence. 2. Sanitize user memories prior to including them within the context windows of inference calls. 3. Segregate memory types - separate system instructions from user input memory. 4. Apply memory attribution - track where memory content originated. ## Related Concepts - [Types of LLM Vulnerabilities](/docs/red-team/llm-vulnerability-types) - [Tool Discovery](/docs/red-team/plugins/tool-discovery) - [Excessive Agency](/docs/red-team/plugins/excessive-agency) - [Hijacking](/docs/red-team/plugins/hijacking)