January
Get File by path
January 23rd, 2025
You can now retrieve a File by its path with the Humanloop API, simplifying how you interact with Files when prototyping or setting up Evaluations.
This new endpoint can be used with our SDKs (ensure you have updated versions) and via the API directly.
In your production systems, we still recommend interacting with Files using their ID, as the path may change if the File (or its parent directories) is moved or renamed.
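To illustrate why IDs are the safer handle in production, here is a minimal sketch using a toy in-memory model. It does not call the Humanloop API or SDK; the `File` and `Workspace` classes, the `file_123` ID, and the example paths are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class File:
    id: str    # stable identifier (hypothetical value)
    path: str  # human-readable location; changes on move or rename


class Workspace:
    """Toy in-memory stand-in for a workspace; not the Humanloop SDK."""

    def __init__(self):
        self._by_id = {}

    def add(self, file):
        self._by_id[file.id] = file

    def get_by_id(self, file_id):
        return self._by_id.get(file_id)

    def get_by_path(self, path):
        # A linear scan is enough for a sketch.
        for f in self._by_id.values():
            if f.path == path:
                return f
        return None

    def move(self, file_id, new_path):
        # Moving or renaming changes the path but never the ID.
        self._by_id[file_id].path = new_path


ws = Workspace()
ws.add(File(id="file_123", path="demos/summarizer"))

found_before = ws.get_by_path("demos/summarizer")  # lookup by path works
ws.move("file_123", "production/summarizer")
found_after = ws.get_by_path("demos/summarizer")   # old path no longer resolves
still_found = ws.get_by_id("file_123")             # ID lookup is unaffected
```

In this model, code that stored the old path breaks after the move, while code that stored the ID keeps working.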
Duplicate Datapoints
January 23rd, 2025
You can now duplicate Datapoints from the UI Editor. This is useful when you are creating many variations of an existing Datapoint.
Alternatively, you can use the .csv upload feature to create multiple Datapoints at once.
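As a rough sketch of the .csv route, the snippet below generates a few variations of one Datapoint with Python's standard `csv` module. The column names (`question`, `target`) and the example values are assumptions for illustration; check the column layout your Dataset expects before uploading.

```python
import csv
import io

# Hypothetical base Datapoint; the column names are assumptions.
base = {"question": "What is the capital of France?", "target": "Paris"}

# Variations that reuse the base and override a single field.
variations = [
    base,
    {**base, "question": "Which city is the capital of France?"},
    {**base, "question": "Name France's capital city."},
]

# Write to an in-memory buffer; swap in open("datapoints.csv", "w", newline="")
# to produce a file you can upload.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["question", "target"])
writer.writeheader()
writer.writerows(variations)

csv_text = buffer.getvalue()
```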
Aggregate stats for Eval Runs
January 18th, 2025
We’ve added aggregate statistics to the Runs table to help you quickly compare performance across different Evaluators. You can view these statistics in the Runs tab of any Evaluation that contains Evaluators.
For boolean Evaluators, we show the percentage of true judgments. For number Evaluators, we display the average value.
For select and multi-select Evaluators, we display a bar chart showing the distribution of the judgments.
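The aggregation rules above can be sketched as follows. This is an illustrative approximation, not Humanloop's implementation: the `aggregate` function, the type labels, and the choice to skip missing (`None`) judgments are all assumptions.

```python
from collections import Counter


def aggregate(evaluator_type, judgments):
    """Summarize a list of judgments for one Evaluator (hypothetical helper)."""
    # Assumption: Logs without a judgment yet are represented as None and skipped.
    values = [j for j in judgments if j is not None]
    if not values:
        return None

    if evaluator_type == "boolean":
        # Percentage of judgments that are True.
        return 100.0 * sum(1 for v in values if v is True) / len(values)
    if evaluator_type == "number":
        # Arithmetic mean of the numeric judgments.
        return sum(values) / len(values)
    if evaluator_type == "select":
        # Count per option; this is the data behind a distribution bar chart.
        return Counter(values)
    if evaluator_type == "multi_select":
        # Each judgment is a list of options; count every selected option.
        return Counter(option for v in values for option in v)
    raise ValueError(f"Unknown evaluator type: {evaluator_type}")


pass_rate = aggregate("boolean", [True, True, False, False])
mean_score = aggregate("number", [1, 2, 3])
distribution = aggregate("select", ["Good", "Bad", "Good"])
```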
Additional icons indicate the status of the Run, relevant to the aggregate stat:
- A spinning icon indicates that not all Logs have judgments, and the Run is currently being executed. The displayed aggregate statistic may not be final.
- A clock icon shows that not all Logs have judgments, though the Run is not currently being executed.
- A red warning icon indicates errors when running the Evaluator.
Hover over these icons or aggregate statistics to view more details in the tooltip, such as the number of judgments and the number of errors (if any).
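One way to read the icon rules above is as a small decision function. The sketch below is an interpretation rather than the UI's actual logic; the function name, its parameters, the icon labels, and the assumption that a warning can appear alongside a spinner or clock are all hypothetical.

```python
def run_status_icons(total_logs, judged_logs, is_running, error_count):
    """Return the icon labels to show next to an aggregate stat (hypothetical)."""
    icons = []
    if judged_logs < total_logs:
        # Judgments are incomplete: spinner while running, clock otherwise.
        icons.append("spinner" if is_running else "clock")
    if error_count > 0:
        # Assumption: errors are flagged independently of completeness.
        icons.append("warning")
    return icons


in_progress = run_status_icons(total_logs=10, judged_logs=4, is_running=True, error_count=0)
stalled = run_status_icons(total_logs=10, judged_logs=4, is_running=False, error_count=0)
complete = run_status_icons(total_logs=10, judged_logs=10, is_running=False, error_count=0)
```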
Filter Eval Runs
January 14th, 2025
You can now compare a subset of your Runs more easily by selecting them in the Runs tab.
To filter to a subset of Runs, go to the Runs tab and select them by clicking the checkbox or by pressing the x key with your cursor on the row.
Then, go to the Stats or Review tab to see the comparison between the selected Runs. Your control Run will always be included in the comparison.
Filter by Judgment in Review tab
January 9th, 2025
You can now filter Logs by Evaluator judgments in the Review tab of an Evaluation. This feature allows you to quickly retrieve specific Logs, such as those marked as “Good” or “Bad” by a subject-matter expert, or those with latency below a certain threshold.
To filter Logs, click on the Filter button on the Review tab to set up your first filter.
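Conceptually, each filter is a condition on an Evaluator's judgment, and a Log is shown only if it satisfies all of them. The sketch below models that on plain dictionaries; the Log shape, the Evaluator names (`rating`, `latency`), and the `filter_logs` helper are hypothetical and do not reflect the Humanloop data model.

```python
# Hypothetical Logs, each with judgments keyed by Evaluator name.
logs = [
    {"id": "log_1", "judgments": {"rating": "Good", "latency": 0.8}},
    {"id": "log_2", "judgments": {"rating": "Bad", "latency": 0.5}},
    {"id": "log_3", "judgments": {"rating": "Good", "latency": 2.4}},
    {"id": "log_4", "judgments": {"latency": 0.3}},  # not yet rated
]


def filter_logs(logs, conditions):
    """Keep Logs whose judgments satisfy every (evaluator, predicate) pair."""
    kept = []
    for log in logs:
        judgments = log["judgments"]
        if all(
            name in judgments and predicate(judgments[name])
            for name, predicate in conditions
        ):
            kept.append(log)
    return kept


# Logs marked "Good" with latency below one second.
good_and_fast = filter_logs(
    logs,
    [
        ("rating", lambda value: value == "Good"),
        ("latency", lambda value: value < 1.0),
    ],
)
good_and_fast_ids = [log["id"] for log in good_and_fast]
```

Logs that lack a judgment for a filtered Evaluator are excluded in this sketch; that behavior is an assumption.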
Template Library
January 4th, 2025
We’ve introduced the first version of our Template Library, designed to help you get started with example projects on Humanloop.
This new feature allows you to browse and search for relevant templates using tags. You can then clone templates into your workspace to help overcome the cold-start problem.
This first release focuses on providing useful Evaluator examples alongside a set of curated datasets from Hugging Face. In upcoming releases, we plan to expand the library with additional Agent and RAG templates for a wide range of use cases. Stay tuned for more updates!