# Test Suite

**What this is:** Vitest-based test suite for core promptfoo functionality.

## Running Tests

```bash
# Run all tests
npm test

# Run specific test file
npx vitest run providers/openai

# Run tests matching pattern
npx vitest run -t "should handle errors"

# Run in watch mode
npm run test:watch

# Run integration tests
npm run test:integration
```

## Critical Rules

- **NEVER** increase test timeouts - fix the slow test
- **NEVER** use `.only()` or `.skip()` in committed code
- **ALWAYS** clean up mocks in `afterEach`
- Tests run in **random order by default** (configured in `vitest.config.ts`)
  - Use `--sequence.shuffle=false` to disable when debugging specific failures
  - Use `--sequence.seed=12345` to reproduce a specific order

## Writing Tests

**Reference files:**

- **Vitest (frontend)**: `src/app/src/hooks/usePageMeta.test.ts` - explicit imports
- **Vitest (backend)**: `test/assertions/contains.test.ts` - explicit imports

All tests require explicit imports from vitest:

```typescript
import { afterEach, describe, expect, it, vi } from 'vitest';

afterEach(() => {
  vi.resetAllMocks(); // Prevents test pollution
});
```

## Directory Structure

Tests mirror the `src/` structure:

- `test/providers/` → `src/providers/`
- `test/redteam/` → `src/redteam/`
- `test/agentSkills/` → agent plugin contract tests; read `test/agentSkills/AGENTS.md`
- `test/fixtures/agent-skills/` → runnable agent skill fixtures; read `test/fixtures/agent-skills/AGENTS.md`

## Mocking

```typescript
import axios from 'axios';
import { vi } from 'vitest';

vi.mock('axios');
const axiosMock = vi.mocked(axios);
axiosMock.post.mockResolvedValue({ data: { result: 'success' } });
```

- Use Vitest's mocking utilities (`vi.mock`, `vi.fn`, `vi.spyOn`)
- Prefer shallow mocking over deep mocking
- Mock external dependencies but not the code being tested
- Reset mocks between tests to prevent test pollution

**Critical: Mock Isolation**

`vi.clearAllMocks()` only clears call history, NOT mock implementations.
Use `mockReset()` for full isolation:

```typescript
beforeEach(() => {
  vi.clearAllMocks(); // Clears .mock.calls and .mock.results
  vi.mocked(myMock).mockReset(); // Also clears mockReturnValue/mockResolvedValue
});
```

For `vi.hoisted()` mocks or mocks with `mockReturnValue()`, you MUST call `mockReset()` in `beforeEach` to ensure test isolation when tests run in random order.

## Environment Variables

Prefer `mockProcessEnv()` from `test/util/utils.ts` for root tests that need to change environment variables. Use `vi.stubEnv()` only when a test specifically needs Vitest's stub behavior, and pair it with `vi.unstubAllEnvs()`. Avoid direct `process.env.FOO = ...`, `delete process.env.FOO`, or `process.env = ...` mutations in new tests. The root hygiene suite blocks new files that use direct environment mutation because tests run in random order.

## Zustand Store Testing

When testing components that use Zustand stores, prefer **integration testing with real stores** over mocking.

### When to Use Real Stores vs Mocking

**Use real stores (preferred)** when:

- Testing components that modify store state
- Verifying state changes after user interactions
- Testing store action logic (merge, update, delete)

**Mock the store** only when:

- Testing pure UI components that only read from stores
- Testing isolated component logic completely unrelated to state

### Pattern: Integration Testing with Real Stores

**Reference**: `src/app/src/pages/redteam/setup/components/PluginsTab.test.tsx`

```typescript
import { act, render, screen, waitFor } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { afterEach, beforeEach, describe, expect, test, vi } from 'vitest';

import { useMyStore } from './useMyStore';

// Capture initial state OUTSIDE describe block
const initialState = useMyStore.getState();

describe('MyComponent', () => {
  beforeEach(() => {
    // Reset store to initial state before each test
    act(() => {
      useMyStore.setState(initialState);
    });
  });

  afterEach(() => {
    // Clean up after each test
    act(() => {
      useMyStore.setState(initialState);
    });
  });

  test('user interaction updates store correctly', async () => {
    const user = userEvent.setup();

    // Verify initial state
    expect(useMyStore.getState().items).toHaveLength(0);

    render(<MyComponent />);
    await user.click(screen.getByRole('button', { name: 'Add Item' }));

    // Assert on actual store state, not mock calls
    await waitFor(() => {
      expect(useMyStore.getState().items).toHaveLength(1);
    });
  });
});
```

### Key Patterns

1. **Capture initial state outside describe**: `const initialState = useMyStore.getState();`
2. **Reset in beforeEach AND afterEach**: Ensures test isolation
3. **Wrap setState in act()**: Required for React state batching
4. **Assert on getState()**: Verify actual store state, not that a mock was called
5. **Use waitFor for async**: Store updates may be asynchronous
6. **Mock external dependencies only**: Mock APIs, analytics, child components—NOT the store

### Anti-Pattern: Mocking the Store (avoid this)

```typescript
// ❌ AVOID: Mocking the store hook loses integration coverage
vi.mock('./useMyStore');
const mockUpdateItems = vi.fn();
(useMyStore as any).mockReturnValue({
  items: [],
  updateItems: mockUpdateItems,
});

test('clicking button calls updateItems', () => {
  render(<MyComponent />);
  fireEvent.click(screen.getByRole('button', { name: 'Add Item' }));

  // Only verifies mock was called, not that store logic works
  expect(mockUpdateItems).toHaveBeenCalled();
});
```

### Additional Store Test References

- `src/app/src/store/providersStore.test.ts` - Basic store testing
- `src/app/src/stores/evalConfig.test.ts` - Configuration state
- `src/app/src/stores/userStore.test.ts` - Async operations with act()

## Provider Testing

Every provider needs tests covering:

1. Success case
2. Error cases (4xx, 5xx, rate limits)
3. Configuration validation
4.
Token usage tracking

When a change introduces request-format gating, parser selection, or middleware validation, add negative-path tests for malformed inputs and assert the exact response contract. See `test/providers/openai-codex-sdk.test.ts` for reference patterns.

## Smoke Tests

Smoke tests verify the **built CLI package** works correctly end-to-end. They test `dist/src/main.js` directly using `spawnSync`.

```bash
# Run smoke tests
npm run test:smoke

# Run specific smoke test file
npx vitest run test/smoke/cli.test.ts --config vitest.smoke.config.ts
```

**Location:** `test/smoke/` with fixtures in `test/smoke/fixtures/configs/`

**Reference pattern** (from `test/smoke/filters-and-flags.test.ts`):

```typescript
import { spawnSync } from 'child_process';
import * as fs from 'fs';
import * as path from 'path';
import { afterAll, beforeAll, describe, expect, it } from 'vitest';

const CLI_PATH = path.resolve(__dirname, '../../dist/src/main.js');
const FIXTURES_DIR = path.resolve(__dirname, 'fixtures/configs');
const OUTPUT_DIR = path.resolve(__dirname, '.temp-output');

function runCli(args: string[], options: { cwd?: string } = {}) {
  const result = spawnSync('node', [CLI_PATH, ...args], {
    cwd: options.cwd || path.resolve(__dirname, '../..'),
    encoding: 'utf-8',
    env: { ...process.env, NO_COLOR: '1' },
    timeout: 60000,
  });
  return {
    stdout: result.stdout || '',
    stderr: result.stderr || '',
    exitCode: result.status ?? 1,
  };
}

describe('My Smoke Tests', () => {
  beforeAll(() => {
    if (!fs.existsSync(CLI_PATH)) {
      throw new Error(`Built CLI not found.
Run 'npm run build' first.`);
    }
    fs.mkdirSync(OUTPUT_DIR, { recursive: true });
  });

  afterAll(() => {
    fs.rmSync(OUTPUT_DIR, { recursive: true, force: true });
  });

  it('runs eval with echo provider', () => {
    const configPath = path.join(FIXTURES_DIR, 'basic.yaml');
    const outputPath = path.join(OUTPUT_DIR, 'output.json');

    const { exitCode } = runCli(['eval', '-c', configPath, '-o', outputPath, '--no-cache']);
    expect(exitCode).toBe(0);

    const parsed = JSON.parse(fs.readFileSync(outputPath, 'utf-8'));
    expect(parsed.results.results[0].response.output).toContain('Hello');
  });
});
```

**Key patterns:**

- Test the **built package** (`dist/src/main.js`), not source code
- Use `spawnSync` with `node` to run the CLI directly
- Fixtures go in `test/smoke/fixtures/configs/` (committed to git)
- Temp output directories cleaned up in `afterAll`
- Use the `echo` provider for deterministic, zero-cost testing
- Check exit codes AND output file contents

**Fixture example** (`test/smoke/fixtures/configs/basic.yaml`):

```yaml
providers:
  - echo
prompts:
  - 'Hello {{name}}'
tests:
  - vars:
      name: World
    assert:
      - type: contains
        value: World
```

See `docs/plans/smoke-tests.md` for the full test checklist.

## Test Configuration

- Config: `vitest.config.ts` (unit tests), `vitest.integration.config.ts` (integration tests), `vitest.smoke.config.ts` (smoke tests)
- Setup: `vitest.setup.ts`
- Globals disabled: all test utilities must be explicitly imported from `vitest`
- Import `describe`, `it`, `expect`, `beforeEach`, `afterEach`, `vi` from `vitest`

## Testing Code with Timers

When testing code that uses `setTimeout`, `setInterval`, or `Date.now()`, use **fake timers** to make tests deterministic and fast.
**Reference files:**

- `test/scheduler/slotQueue.test.ts` - Comprehensive fake timer usage
- `test/scheduler/providerRateLimitState.test.ts` - Async code with timers

### Basic Pattern

```typescript
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';

describe('MyTimerCode', () => {
  beforeEach(() => {
    vi.useFakeTimers();
  });

  afterEach(() => {
    vi.useRealTimers();
  });

  it('handles timeout', () => {
    const callback = vi.fn();
    setTimeout(callback, 1000);

    vi.advanceTimersByTime(1000);
    expect(callback).toHaveBeenCalled();
  });
});
```

### Async Code with Timers

When testing async functions that internally use timers, you cannot simply `await` them—the promise will hang because fake timers don't advance automatically. Use this pattern:

```typescript
it('handles async retry with delay', async () => {
  // 1. Start the promise (don't await yet)
  const promise = myAsyncFunctionWithRetry();

  // 2. Run all timers to completion
  await vi.runAllTimersAsync();

  // 3. Now await the result
  const result = await promise;
  expect(result).toBe('success');
});
```

### Why Fake Timers Matter

Real timers cause flaky tests because:

- **Timer resolution varies**: Windows has ~15ms minimum resolution vs ~1-4ms on Linux/Mac
- **Race conditions**: Code checking `Date.now()` multiple times can see different values if a millisecond boundary is crossed
- **CI variability**: Tests may pass locally but fail under CI load

**Anti-pattern** (causes flaky tests):

```typescript
// ❌ AVOID: Real timers with small delays
it('retries after delay', async () => {
  const result = await functionThatRetries({ retryAfterMs: 1 }); // May time out randomly
});
```

**Correct pattern**:

```typescript
// ✓ CORRECT: Fake timers with controlled advancement
it('retries after delay', async () => {
  vi.useFakeTimers();
  const promise = functionThatRetries({ retryAfterMs: 1000 });
  await vi.runAllTimersAsync();
  const result = await promise;
  expect(result).toBe('success');
  vi.useRealTimers();
});
```

## Best Practices

- Ensure all tests are
independent and can run in any order
- Clean up any test data or mocks after each test
- Run the full test suite before committing changes
- Test failures should be deterministic
- For database tests, use in-memory instances or proper test fixtures
- **Use fake timers** for any code involving `setTimeout`, `setInterval`, or timing-sensitive logic