August 13, 2024

Evaluations improvements

We’ve made improvements to help you evaluate the components of your AI applications, quickly spot issues, and explore the full context of each evaluation.

A clearer Evaluation tab in Logs

We’ve given the Log drawer’s Evaluation tab a facelift. You can now clearly see the results from each of the connected Evaluators.

This makes it easier to debug the judgments applied to a Log and, if necessary, re-run code and AI Evaluators in-line.

Log drawer's Evaluation tab with the "Run again" menu open

Ability to re-run Evaluators

We’ve introduced the ability to re-run your Evaluators against a specific Log. This makes it easier to address and fix issues with previous Evaluator judgments.

To request a re-run, open the menu next to the Evaluator and select the “Run again” option.
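As a reminder, code Evaluators are Python functions that are executed against each Log to produce a judgment. The sketch below shows the general shape of one; the function name and the assumption that the Log arrives as a dictionary with an “output” field are illustrative, so check the Evaluator editor for the exact interface in your workspace.

    def evaluator(log):
        # Judge the Log's generated output; here, a simple boolean check
        # that the response is non-empty and stays under a length budget.
        output = log.get("output") or ""
        return 0 < len(output) <= 2000

If a judgment like this was produced by an earlier version of the Evaluator, “Run again” re-executes it against the same Log and records a fresh result.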

Evaluation popover

If you hover over an evaluation result, you’ll now see a popover with more details about the evaluation, including any intermediate results or console logs, without having to switch context.

Evaluation popover

Updated Evaluator Logs table

The Logs table for Evaluators now supports the same functionality you’d expect from our other Logs tables, making it easier to filter and sort your Evaluator judgments.