August 14, 2023
Offline evaluations with testsets
We’re continuing to build and release more functionality to Humanloop’s evaluations framework!
Our first release provided the ability to run online evaluators in your projects. Online evaluators allow you to monitor the performance of your live deployments by defining functions which evaluate all new datapoints in real time as they get logged to the project.
Today, to augment online evaluators, we are releasing offline evaluators as the second part of our evaluations framework.
Offline evaluators provide the ability to test your prompt engineering efforts rigorously in development and CI. Offline evaluators test the performance of your model configs against a pre-defined suite of testcases - much like unit testing in traditional programming.
With this framework, you can use test-driven development practices to iterate and improve your model configs, while monitoring for regressions in CI.
To learn more about how to use online and offline evaluators, check out the Evaluate your model section of our guides.