# compare-agentic-sdks (Agentic SDK Comparison)

Compare OpenAI Codex SDK, Claude Agent SDK, and OpenCode SDK on a security audit task.

## Quick Start

```bash
npx promptfoo@latest init --example compare-agentic-sdks
npx promptfoo eval
npx promptfoo view
```

## What This Compares

Four providers analyze an intentionally vulnerable Python codebase:

| Provider             | How It Works                                 | Output                |
| -------------------- | -------------------------------------------- | --------------------- |
| **Codex SDK**        | Reads files implicitly, uses `output_schema` | Structured JSON       |
| **Claude Agent SDK** | Uses Read/Grep/Glob tools explicitly         | Natural language      |
| **OpenCode SDK**     | Uses read/grep/glob tools, provider-agnostic | Natural language      |
| **Plain LLM**        | No file access (baseline)                    | Explains how to audit |

## Vulnerabilities Planted

The vulnerable code lives in the [test-codebase](./test-codebase/) directory.

**`user_service.py`:**

- MD5 password hashing
- Timing attack in authentication
- Predictable session tokens

**`payment_processor.py`:**

- Float for currency (precision loss)
- PCI-DSS violations (storing CVV)
- Sensitive data in logs

## Key Differences

**Codex SDK** returns structured JSON matching the schema. Fast, predictable, good for automation. OpenAI only.

**Claude Agent SDK** uses file system tools to explore, returns natural language. More flexible, shows reasoning. Anthropic only.

**OpenCode SDK** uses file system tools similar to Claude Agent SDK, but supports 75+ LLM providers including Anthropic, OpenAI, Google, Ollama (local), and more.

**Plain LLM** can't read files, so it explains how to do a security audit instead of doing one.

## Learn More

- [Evaluate Coding Agents Guide](/docs/guides/evaluate-coding-agents)
- [OpenAI Codex SDK](/docs/providers/openai-codex-sdk)
- [Claude Agent SDK](/docs/providers/claude-agent-sdk)
- [OpenCode SDK](/docs/providers/opencode-sdk)
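
For readers unfamiliar with the items under Vulnerabilities Planted, the sketch below shows what safer counterparts to four of them might look like using only the Python standard library. It is illustrative only and is not taken from `test-codebase`: the function names (`hash_password`, `verify_password`, `new_session_token`, `add_prices`) and the iteration count are assumptions made for this example.

```python
import hashlib
import hmac
import secrets
from decimal import Decimal


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a key with PBKDF2-HMAC-SHA256 instead of a bare MD5 digest.

    The iteration count here is an illustrative assumption, not a recommendation.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Compare digests in constant time to avoid a timing side channel."""
    return hmac.compare_digest(hash_password(password, salt), expected)


def new_session_token() -> str:
    """Generate an unpredictable token from the OS CSPRNG."""
    return secrets.token_urlsafe(32)


def add_prices(a: str, b: str) -> Decimal:
    """Add currency amounts as Decimal to avoid binary float rounding."""
    return Decimal(a) + Decimal(b)
```

A useful agent report would be expected to flag the insecure patterns and point toward fixes along these lines. CVV storage and sensitive data in logs are policy issues that a short snippet cannot meaningfully demonstrate, so they are omitted here.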