# config-websockets/streaming (WebSocket Streaming) This example shows how to configure a websocket application that streams its responses. It includes a small Node.js server that exposes two WebSocket endpoints: - A non-streaming endpoint (`/ws`) that returns a single message when the model finishes. - A streaming endpoint (`/ws-stream`) that sends incremental deltas and a final message. You’ll run the server locally and use promptfoo’s eval command to test the quality of the application. You can run this example with: ```bash npx promptfoo@latest init --example config-websockets/streaming cd config-websockets/streaming ``` ## What’s in this folder - `promptfooconfig.yaml` – Configures a target pointing at the local WebSocket server using the streaming endpoint - `server/` – Minimal Express + WebSocket server that calls the OpenAI Responses API and exposes the two endpoints ## Prerequisites - Node.js 20+ - An OpenAI API key set as `OPENAI_API_KEY` ## 1) Start the local WebSocket server From this directory: ```bash cd server npm install # Option A: set environment variables in your shell export OPENAI_API_KEY=your_key_here # Optional: # export CHATBOT_MODEL=gpt-4.1-mini # defaults to gpt-4.1-mini # export PORT=3300 # defaults to 3300 # Start the server npm start ``` You should see the server listening at `http://localhost:3300`. Health check: ```bash curl http://localhost:3300/health # {"status":"ok"} ``` WebSocket Endpoints: - `ws://localhost:3300/ws` – non-streaming - `ws://localhost:3300/ws-stream` – streaming (sends `delta` updates and a final `message`) ## 2) How the WebSocket configuration works In `promptfooconfig.yaml`, the websocket endpoint is configured under the websocket endpoint id: ```yaml - id: 'ws://localhost:3300/ws-stream' ``` The target configuration uses the streamResponse function `streamResponse(accumulator, data, context?)` to decide when to stop and what to return. ## Server Response Format The server three types of messages: 1. `delta` messages that include a partial response 2. `message` messages that include the finalized response in full 3. `error` messages that indicate an error occurred Example of a successful message stream: ```json {"type":"delta","message":"Part of a thought"} {"type":"message","message":"Part of a thought, now the thought is completed"} ``` The streamResponse function includes logic for handling these different cases. Note: the `delta` case is the fallback, which returns false for the second item in the tuple to indicate the response is not yet complete: ```yaml - id: 'ws://localhost:3300/ws-stream' config: messageTemplate: '{"input": {{prompt | dump}}}' streamResponse: | (accumulator, event, context) => { const { message, type } = JSON.parse(event.data); if (type === 'message') { return [{ output: message }, true]; } if (type === 'error') { return [{ error: message }, true]; } return [{output: message}, false]; } ``` Tip: If you need to concatenate partials for UX, you can return an accumulator object with the concatenated value on `delta` frames and only return `true` when you receive the final message. ## 3) Run the evaluation With the server running, open a new terminal at this example directory and run: ```bash promptfoo eval ``` This will evaluate the test cases against the streaming WebSocket endpoint. View results in the browser UI: ```bash promptfoo view ``` ## Troubleshooting - If requests fail immediately, ensure `OPENAI_API_KEY` is set in the environment where the server is running. - If the client can’t connect, verify the server is listening on the expected port (`PORT`, defaults to 3300) and that you’re using the correct `ws://` URL. - For streaming behavior, watch the server logs and confirm you’re receiving `delta` events followed by a final `message`. ## Cleanup Stop the server with `Ctrl+C` in its terminal.