--- sidebar_position: 52 sidebar_label: Ruby description: Create advanced Ruby validation scripts with complex logic, external APIs, and custom libraries for sophisticated output grading --- # Ruby assertions The `ruby` assertion allows you to provide a custom Ruby method to validate the LLM output. A variable named `output` is injected into the context. The method should return `true` if the output passes the assertion, and `false` otherwise. If the method returns a number, it will be treated as a score. Example: ```yaml assert: - type: ruby value: output[5..9] == 'Hello' ``` You may also return a number, which will be treated as a score: ```yaml assert: - type: ruby value: Math.log10(output.length) * 10 ``` ## Multiline functions Ruby assertions support multiline strings: ```yaml assert: - type: ruby value: | # Insert your scoring logic here... if output == 'Expected output' return { 'pass' => true, 'score' => 0.5, } end return { 'pass' => false, 'score' => 0, } ``` ## Using test context A `context` object is available in the Ruby method. Here is its type definition: ```ruby # TraceSpan { 'spanId' => String, 'parentSpanId' => String | nil, 'name' => String, 'startTime' => Integer, # Unix timestamp in milliseconds 'endTime' => Integer | nil, # Unix timestamp in milliseconds 'attributes' => Hash | nil, 'statusCode' => Integer | nil, 'statusMessage' => String | nil } # TraceData { 'traceId' => String, 'spans' => Array[TraceSpan] } # AssertionValueFunctionContext { # Raw prompt sent to LLM 'prompt' => String | nil, # Test case variables 'vars' => Hash[String, String | Object], # The complete test case 'test' => Hash, # Contains keys like "vars", "assert", "options" # Log probabilities from the LLM response, if available 'logProbs' => Array[Float] | nil, # Configuration passed to the assertion 'config' => Hash | nil, # The provider that generated the response 'provider' => Object | nil, # ApiProvider type # The complete provider response 'providerResponse' => Object | nil, # ProviderResponse type # OpenTelemetry trace data (when tracing is enabled) 'trace' => TraceData | nil } ``` For example, if the test case has a var `example`, access it in Ruby like this: ```yaml tests: - description: 'Test with context' vars: example: 'Example text' assert: - type: ruby value: 'context["vars"]["example"] in output' ``` ## External .rb To reference an external file, use the `file://` prefix: ```yaml assert: - type: ruby value: file://relative/path/to/script.rb config: outputLengthLimit: 10 ``` You can specify a particular method to use by appending it after a colon: ```yaml assert: - type: ruby value: file://relative/path/to/script.rb:custom_assert ``` You can also specify a class method on some class or module in the file: ```yaml assert: - type: ruby value: file://relative/path/to/script.rb:Validators::Format.check_length ``` If no method is specified, it defaults to `get_assert`. This file will be called with an `output` string and an `AssertionValueFunctionContext` object (see above). It expects that either a `bool` (pass/fail), `float` (score), or `GradingResult` will be returned. Here's an example `assert.rb`: ```ruby require 'json' # Default function name def get_assert(output, context) puts 'Prompt:', context['prompt'] puts 'Vars', context['vars']['topic'] # This return is an example GradingResult hash { 'pass' => true, 'score' => 0.6, 'reason' => 'Looks good to me', } end # Custom function name def custom_assert(output, context) output.length > 10 end ``` This is an example of an assertion that uses data from a configuration defined in the assertion's YML file: ```ruby def get_assert(output, context) output.length <= context.fetch('config', {}).fetch('outputLengthLimit', 0) end ``` You can also return nested metrics and assertions via a `GradingResult` object: ```ruby { 'pass' => true, 'score' => 0.75, 'reason' => 'Looks good to me', 'componentResults' => [{ 'pass' => output.downcase.include?('bananas'), 'score' => 0.5, 'reason' => 'Contains banana', }, { 'pass' => output.downcase.include?('yellow'), 'score' => 0.5, 'reason' => 'Contains yellow', }] } ``` ### GradingResult types Here's a Ruby type definition you can use for the [`GradingResult`](/docs/configuration/reference/#gradingresult) object: ```ruby # GradingResult { 'pass' => Boolean, # Can also use 'pass_' if 'pass' conflicts with Ruby keywords 'score' => Float, 'reason' => String, 'componentResults' => Array[GradingResult] | nil, # Component results (optional) 'namedScores' => Hash[String, Float] | nil # Appear as metrics in the UI (optional) } ``` :::tip Snake case support Ruby snake_case fields are automatically mapped to camelCase: - `pass_` → `pass` (or just use `"pass"` as a hash key) - `named_scores` → `namedScores` - `component_results` → `componentResults` - `tokens_used` → `tokensUsed` ::: ## Using trace data When [tracing is enabled](/docs/tracing/), OpenTelemetry trace data is available in the `context['trace']` object. This allows you to write assertions based on the execution flow: ```ruby def get_assert(output, context) # Check if trace data is available unless context['trace'] # Tracing not enabled, skip trace-based checks return true end # Access trace spans spans = context['trace']['spans'] # Example: Check for errors in any span error_spans = spans.select { |s| s.fetch('statusCode', 0) >= 400 } if error_spans.any? return { 'pass' => false, 'score' => 0, 'reason' => "Found #{error_spans.length} error spans" } end # Example: Calculate total trace duration if spans.any? duration = spans.map { |s| s.fetch('endTime', 0) }.max - spans.map { |s| s['startTime'] }.min if duration > 5000 # 5 seconds return { 'pass' => false, 'score' => 0, 'reason' => "Trace took too long: #{duration}ms" } end end # Example: Check for specific operations api_calls = spans.select { |s| s['name'].downcase.include?('http') } if api_calls.length > 10 return { 'pass' => false, 'score' => 0, 'reason' => "Too many API calls: #{api_calls.length}" } end true end ``` Example YAML configuration: ```yaml tests: - vars: query: "What's the weather?" assert: - type: ruby value: | # Ensure retrieval happened before response generation if context['trace'] spans = context['trace']['spans'] retrieval_span = spans.find { |s| s['name'].include?('retrieval') } generation_span = spans.find { |s| s['name'].include?('generation') } if retrieval_span && generation_span return retrieval_span['startTime'] < generation_span['startTime'] end end true ``` ## Overriding the Ruby binary By default, promptfoo will run `ruby` in your shell. Make sure `ruby` points to the appropriate executable. If a `ruby` binary is not present, you will see a "ruby: command not found" error. To override the Ruby binary, set the `PROMPTFOO_RUBY` environment variable. You may set it to a path (such as `/path/to/ruby`) or just an executable in your PATH (such as `ruby`). ## Negation Use `not-ruby` to invert the final pass/fail result while preserving the returned score. Numeric scores are still compared against `threshold` before the result is inverted: ```yaml assert: - type: not-ruby value: output.include?('error') ``` ## Other assertion types For more info on assertions, see [Test assertions](/docs/configuration/expected-outputs).