--- sidebar_label: Evaluating LLM Text-to-SQL Performance description: Compare text-to-SQL accuracy across GPT-5-mini and GPT-5 using automated test cases and schema validation to optimize database query generation performance --- # Evaluating LLM text-to-SQL performance Promptfoo is a command-line tool that allows you to test and validate text-to-SQL conversions. This guide will walk you through setting up an eval harness that will help you improve the quality of your text-to-SQL prompts. The end result is a view that looks like this: ![text to sql evaluation](/img/docs/text-to-sql-eval.png) ## Configuration Start by creating a `promptfooconfig.yaml` file. ### Step 1: Define the Prompt(s) Specify the text prompts that will be used to generate the SQL queries. Use `{{placeholders}}` for variables that will be replaced with actual values during testing. ```yaml prompts: - | Output a SQL query that returns the number of {{product}} sold in the last month. Database schema: {{database}} Only output SQL code. ``` If you'd like, you can reference prompts in an external file: ```yaml: prompts: - file://path/to/my_prompt.txt - file://path/to/another_prompt.json ``` ### Step 2: Specify the Providers Define one or more language model providers to use. For example, here we compare the performance between GPT-5-mini and GPT-5: ```yaml providers: - openai:gpt-5-mini - openai:gpt-5 ``` A wide variety of LLM APIs are supported, including local models. See [providers](/docs/providers) for more information. ### Step 3: Define the Tests Create test cases to validate the generated SQL queries. Each test case includes: - **vars**: Variables used in the prompt template. - **assert**: Assertions to used to validate the output. #### Basic SQL Validation This test checks produces a query for `bananas` (remember our prompt above) and confirms that the generated output is valid SQL. ```yaml - vars: product: bananas database: file://database.sql assert: - type: is-sql ``` :::tip Use `contains-sql` instead of `is-sql` to allow responses that contain text with SQL code blocks. ::: #### Table-Specific SQL Validation This test ensures the SQL query only uses specified tables (`Products` and `Shipments`). ```yaml - vars: product: apples database: file://database.sql assert: - type: is-sql value: databaseType: 'MySQL' allowedTables: - select::null::Products - select::null::Shipments ``` The format for allowed notation is ` {type}::{tableName}::{columnName}`, and `null` can be used to allow any. #### Column-Specific SQL Validation This test is expected to fail since the `DoesntExist` column is not present in the database: ```yaml - vars: product: oranges database: file://database.sql assert: - type: is-sql value: databaseType: 'MySQL' allowedColumns: - select::null::DoesntExist ``` ### Step 4: Define the Database Schema Define the structure of your database in a separate SQL file (`database.sql`). ```sql CREATE DATABASE IF NOT EXISTS ShipmentSystem; USE ShipmentSystem; CREATE TABLE IF NOT EXISTS Products ( product_id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(255) NOT NULL, description TEXT, price DECIMAL(10, 2) NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); CREATE TABLE IF NOT EXISTS Shipments ( shipment_id INT AUTO_INCREMENT PRIMARY KEY, product_id INT NOT NULL, quantity INT NOT NULL, shipment_date DATE NOT NULL, status VARCHAR(50) NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, FOREIGN KEY (product_id) REFERENCES Products(product_id) ); CREATE TABLE IF NOT EXISTS ShipmentDetails ( detail_id INT AUTO_INCREMENT PRIMARY KEY, shipment_id INT NOT NULL, location VARCHAR(255) NOT NULL, status VARCHAR(50) NOT NULL, updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, FOREIGN KEY (shipment_id) REFERENCES Shipments(shipment_id) ); ``` ### Final Configuration Combine all the steps into a final configuration file (`promptfooconfig.yaml`): ```yaml description: 'Is-SQL example' prompts: - | Output a SQL query that returns the number of {{product}} sold in the last month. Database schema: {{database}} Only output SQL code. providers: - openai:gpt-5-mini tests: - vars: product: bananas database: file://database.sql assert: - type: is-sql - vars: product: apples database: file://database.sql assert: - type: is-sql value: databaseType: 'MySQL' allowedTables: - select::null::Products - select::null::Shipments - vars: product: oranges database: file://database.sql assert: - type: is-sql value: databaseType: 'MySQL' allowedColumns: - select::null::DoesntExist ``` ## Running Tests Run your tests: ``` npx promptfoo@latest eval ``` This will generate a summary of outputs in your terminal. ## Review results Use the web viewer: ``` npx promptfoo@latest view ``` This will open your test results and allow you to refine your prompts and compare model performance. ![text to sql evaluation](/img/docs/text-to-sql-eval.png)