---
sidebar_label: WebSockets
description: Configure WebSocket endpoints for real-time LLM inference with custom message templates, response parsing, and secure authentication for bidirectional API integration
---

# WebSockets

The WebSocket provider allows you to connect to a WebSocket endpoint for inference. This is useful for real-time, bidirectional communication. WebSockets are often used to stream messages that contain partial responses to improve the perceived performance of LLM applications.

Promptfoo supports a range of implementations, from servers that respond with a single message containing the full response to those that stream a series of partial responses.

## Configuration

To use the WebSocket provider, set the provider `id` to a WebSocket URL (or to `websocket`, with the URL given as `url` in `config`) and provide the remaining configuration in the `config` section.

```yaml
providers:
  - id: 'wss://example.com/ws'
    config:
      messageTemplate: '{"prompt": "{{prompt}}", "model": "{{model}}"}'
      transformResponse: 'data.output'
      timeoutMs: 300000
      headers:
        Authorization: 'Bearer your-token-here'
```

### Configuration Options

- `url` (optional when the provider `id` is itself a URL): The WebSocket URL to connect to. If omitted, the provider `id` is used as the URL.
- `messageTemplate` (required): A template for the message to be sent over the WebSocket connection. You can use placeholders like `{{prompt}}`, which will be replaced with the actual prompt.
- `transformResponse` (optional): A JavaScript snippet or function that extracts the desired output from the WebSocket response, which is passed in as the `data` parameter. If not provided, the entire response is used as the output. If the response is valid JSON, the parsed object is returned.
- `streamResponse` (optional): A JavaScript function that extracts the desired output from streamed WebSocket messages when the server sends multiple messages per prompt. It receives `(accumulator, data, context?)` and must return `[nextAccumulator, complete]`. When `streamResponse` is provided, it is used instead of `transformResponse`.
- `timeoutMs` (optional): The timeout in milliseconds for the WebSocket connection. Default is 300000 (5 minutes).
- `headers` (optional): A map of HTTP headers to include in the WebSocket connection request. Useful for authentication or other custom headers.

## Using Variables

You can use test variables in your `messageTemplate`:

```yaml
providers:
  - id: 'wss://example.com/ws'
    config:
      messageTemplate: '{"prompt": {{ prompt | dump }}, "model": {{ model | dump }}, "language": {{ language | dump }} }'
      transformResponse: 'data.translation'

tests:
  - vars:
      model: 'gpt-4'
      language: 'French'
```

## Parsing the Response

Use the `transformResponse` property to extract specific values from the WebSocket response. For example:

```yaml
providers:
  - id: 'wss://example.com/ws'
    config:
      messageTemplate: '{"prompt": {{ prompt | dump }} }'
      transformResponse: 'data.choices[0].message.content'
```

This configuration extracts the message content from a response structure similar to:

```json
{
  "choices": [
    {
      "message": {
        "content": "This is the response."
      }
    }
  ]
}
```

## Streaming Responses

Some WebSocket endpoints stream their replies as multiple messages (for example, token-by-token deltas) before sending a final completion. Use `streamResponse` to handle these incremental messages and decide when you're done.

### How `streamResponse` works

- It is called for every incoming WebSocket message and receives:
  - `accumulator`: the current accumulated result. This should be a `ProviderResponse`-shaped object, e.g. `{ output: string }`.
  - `data`: the raw WebSocket message event. Access the payload via `data.data`. If your server sends JSON, you will typically start by parsing it, for example with `JSON.parse(data.data)`.
  - `context` (optional): the call context from `callApi`, including test vars and flags.
- It must return a tuple `[result, complete]` where:
  - `result`: the updated accumulated result you want to carry forward.
  - `complete` (boolean): set `true` only when you’ve received the final message and want to stop streaming and return the result.

When `complete` is `false`, promptfoo keeps the WebSocket open and waits for the next message. When `true`, the connection is closed and `result` is returned (after being normalized as a `ProviderResponse`).

:::info
`data` is the browser/Node `MessageEvent`. Most servers send the useful payload in `data.data` as a string. Parse it if needed:

```js
const message = typeof data.data === 'string' ? JSON.parse(data.data) : data.data;
```

:::

### Example: Concatenate streamed chunks into a single answer

Imagine your server streams JSON like this while writing a travel suggestion:

```json
{"type":"chunk","text":"You should visit "}
{"type":"chunk","text":"Kyoto in spring."}
{"type":"done"}
```

Here’s a `streamResponse` that concatenates the `text` fields until a `type: done` message arrives:

```yaml
providers:
  - id: 'wss://example.com/ws'
    config:
      messageTemplate: '{"prompt": {{ prompt | dump }} }'
      streamResponse: |
        (accumulator, data, context) => {
          const msg = typeof data.data === 'string' ? JSON.parse(data.data) : data.data;
          const previous = typeof accumulator?.output === 'string' ? accumulator.output : '';
          if (msg?.type === 'chunk' && typeof msg.text === 'string') {
            return [{ output: previous + msg.text }, false];
          }
          if (msg?.type === 'done') {
            return [{ output: previous }, true];
          }
          return [accumulator, false];
        }
```

This returns a single final string, "You should visit Kyoto in spring.", once the `done` message is received.

### Example: Filter out non-final messages using a `complete` flag

Many realtime APIs emit interim deltas and a final message that includes `complete: true`.
Suppose the stream contains a friendly recipe generation conversation like:

```json
{"role":"assistant","event":"delta","content":"Start by sautéing onions...","complete":false}
{"role":"assistant","event":"delta","content":" then add tomatoes and simmer.","complete":false}
{"role":"assistant","event":"final","content":"Start by sautéing onions, then add tomatoes and simmer.","complete":true}
```

If you only want to score the finished answer (not each partial), set `complete` to `true` only on the final frame and ignore everything else:

```yaml
providers:
  - id: 'wss://example.com/ws'
    config:
      messageTemplate: '{"prompt": {{ prompt | dump }} }'
      streamResponse: |
        (accumulator, data, context) => {
          const msg = typeof data.data === 'string' ? JSON.parse(data.data) : data.data;
          if (msg?.complete === true) {
            return [{ output: msg.content }, true];
          }
          // Not complete yet: keep waiting and keep the previous accumulator
          return [accumulator, false];
        }
```

### Example: Accumulate partials and still stop on `complete`

Sometimes you want the best of both worlds: concatenate partials for UI preview, but only finalize when the API says it’s done. A common pattern for customer support answers:

```yaml
providers:
  - id: 'wss://example.com/ws'
    config:
      messageTemplate: '{"prompt": {{ prompt | dump }} }'
      streamResponse: |
        (accumulator, data, context) => {
          const msg = typeof data.data === 'string' ? JSON.parse(data.data) : data.data;
          const previous = typeof accumulator?.output === 'string' ? accumulator.output : '';
          if (msg?.event === 'delta' && typeof msg.content === 'string') {
            return [{ output: previous + msg.content }, false];
          }
          if (msg?.complete === true) {
            return [{ output: previous }, true];
          }
          return [accumulator, false];
        }
```

### Referencing a function from a file

For larger handlers, keep the logic in a file and reference it:

```yaml
providers:
  - id: 'wss://example.com/ws'
    config:
      messageTemplate: '{"prompt": {{ prompt | dump }} }'
      streamResponse: 'file://scripts/wsStreamHandler.js'
```

You can also point to a named export: `file://scripts/wsStreamHandler.js:myHandler`.

## Using as a Library

If you are using promptfoo as a node library, you can provide the equivalent provider config:

```js
{
  // ...
  providers: [{
    id: 'wss://example.com/ws',
    config: {
      messageTemplate: '{"prompt": "{{prompt}}"}',
      transformResponse: (data) => data.foobar,
      timeoutMs: 15000,
    }
  }],
}
```

Note that when using the WebSocket provider, a connection is opened for each API call and closed after the response is received or the timeout is reached.

## Reference

Supported config options:

| Option            | Type               | Description                                                                                                                                                          |
| ----------------- | ------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| url               | string             | The WebSocket URL to connect to. If not provided, the `id` of the provider will be used as the URL.                                                                  |
| messageTemplate   | string             | A template string for the message to be sent over the WebSocket connection. Supports Nunjucks templating.                                                            |
| transformResponse | string or Function | A function body or string to parse a single response. Ignored when `streamResponse` is provided.                                                                     |
| streamResponse    | string or Function | A function body, function expression, or `file://` reference that receives `(accumulator, data, context?)` and returns `[result, complete]` for streamed messages.   |
| timeoutMs         | number             | The timeout in milliseconds for the WebSocket connection. Defaults to 300000 (5 minutes) if not specified.                                                           |
| headers           | object             | A map of HTTP headers to include in the WebSocket connection request. Useful for authentication or other custom headers.                                             |

Note: The `messageTemplate` supports Nunjucks templating, allowing you to use the `{{prompt}}` variable or any other variables passed in the test context.

In addition to a full URL, the provider `id` field accepts `ws`, `wss`, or `websocket` as values.

:::info
If you're using the OpenAI Realtime provider, you can configure custom endpoints via `apiBaseUrl` (or env vars). The provider automatically converts `https://` → `wss://` and `http://` → `ws://`. See the OpenAI docs: `/docs/providers/openai/#custom-endpoints-and-proxies-realtime`.
:::
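
### Dry-running a `streamResponse` handler locally

Before pointing promptfoo at a live endpoint, it can help to replay a few recorded frames through a `streamResponse` handler and check the accumulated result. The following is a minimal sketch, not part of promptfoo: `replay` is a hypothetical helper, the `{ output: '' }` starting accumulator is an assumption about the initial value, and the frames reuse the `chunk`/`done` message shape from the travel-suggestion example in this page.

```javascript
// Handler with the (accumulator, data, context?) => [result, complete] contract
// described in this page. Message shape ({type, text}) is taken from the
// "Concatenate streamed chunks" example.
function wsStreamHandler(accumulator, data, context) {
  const msg = typeof data.data === 'string' ? JSON.parse(data.data) : data.data;
  const previous = typeof accumulator?.output === 'string' ? accumulator.output : '';
  if (msg?.type === 'chunk' && typeof msg.text === 'string') {
    return [{ output: previous + msg.text }, false];
  }
  if (msg?.type === 'done') {
    return [{ output: previous }, true];
  }
  // Unknown frame: carry the accumulator forward unchanged.
  return [accumulator, false];
}

// Hypothetical test helper (an assumption, not a promptfoo API): feeds each raw
// frame to the handler wrapped as { data: frame }, mimicking a MessageEvent,
// and stops at the first frame for which the handler reports completion.
function replay(handler, frames) {
  let accumulator = { output: '' }; // assumed starting value
  for (const frame of frames) {
    const [next, complete] = handler(accumulator, { data: frame }, {});
    accumulator = next;
    if (complete) {
      break;
    }
  }
  return accumulator;
}

const result = replay(wsStreamHandler, [
  '{"type":"chunk","text":"You should visit "}',
  '{"type":"chunk","text":"Kyoto in spring."}',
  '{"type":"done"}',
  '{"type":"chunk","text":" (ignored: arrives after done)"}',
]);

console.log(result.output); // prints "You should visit Kyoto in spring."
```

Because the handler is a plain function, the same body can be pasted into the `streamResponse` block in YAML or moved into a file referenced with `file://` once it behaves as expected on recorded frames.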