July
Select multiple Versions when creating an Evaluation
July 30th, 2024
Our Evaluations feature allows you to benchmark Versions of the same File. We’ve made the form for creating new Evaluations simpler by allowing you to select multiple Versions in the picker dialog. Columns will be filled or inserted as needed.
As an added bonus, adding and removing columns now feels smoother thanks to animations, and the form will scroll to newly added columns.
Faster log queries
July 19th, 2024
Queries against your logs should now load faster, and the tables should render more quickly.
We’re still making further enhancements, so keep an eye out for more speed-ups coming soon!
gpt-4o-mini support
July 18th, 2024
The latest model from OpenAI, GPT-4o mini, has been added. It’s a smaller version of GPT-4o that delivers GPT-4-level performance while being 60% cheaper than gpt-3.5-turbo.
- Cost: 15 cents per million input tokens, 60 cents per million output tokens
- Performance: MMLU score of 82%
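To get a feel for these rates, here is a minimal sketch of a per-request cost estimate. The function name and the example token counts are illustrative assumptions, not part of any API.

```python
# Published gpt-4o-mini rates: $0.15 per 1M input tokens, $0.60 per 1M output tokens.
INPUT_COST_PER_TOKEN = 0.15 / 1_000_000   # dollars per input token
OUTPUT_COST_PER_TOKEN = 0.60 / 1_000_000  # dollars per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of a single request in dollars."""
    return input_tokens * INPUT_COST_PER_TOKEN + output_tokens * OUTPUT_COST_PER_TOKEN

# Example: 10,000 input tokens and a 1,000-token completion
# works out to $0.0015 + $0.0006 = $0.0021.
print(f"${estimate_cost(10_000, 1_000):.4f}")
```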
Enhanced code Evaluators
July 10th, 2024
We’ve introduced several enhancements to our code Evaluator runtime environment to support additional packages, environment variables, and improved runtime output.
Runtime environment
Our Code Evaluator now logs both `stdout` and `stderr` when executed, and environment variables can now be accessed via the `os.environ` dictionary, allowing you to retrieve values such as `os.environ['HUMANLOOP_API_KEY']` or `os.environ['PROVIDER_KEYS']`.
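As a rough illustration, the sketch below shows how an Evaluator might use these capabilities. The `evaluator` signature and the `log` fields are assumptions made for the example, not the exact Humanloop interface; adapt them to your own Evaluator setup.

```python
import os
import sys

def evaluator(log):
    # Hypothetical signature and log fields, for illustration only.
    api_key = os.environ['HUMANLOOP_API_KEY']  # env vars are available via os.environ
    print(f"Running with key ending ...{api_key[-4:]}")  # appears in captured stdout

    output = log.get("output", "")
    if not output:
        print("Log has no output", file=sys.stderr)  # appears in captured stderr
        return False
    return True
```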
Python packages
Previously, the selection of Python packages we could support was limited. We are now able to accommodate customer-requested packages. If you have specific package requirements for your eval workflows, please let us know!