Run a Human Evaluation
Collect judgments from subject-matter experts (SMEs) to better understand the quality of your AI model outputs.
In this guide, we’ll show you how SMEs can provide judgments within an Evaluation to help you understand and improve the quality of your model outputs.
Prerequisites
- You have set up a Human Evaluator appropriate for your use case. If not, follow our guide to create a Human Evaluator.
Using a Human Evaluator in an Evaluation
Create a new Evaluation
Navigate to the Prompt you want to evaluate and click on the Evaluation tab at the top of the page. Click on Evaluate to create a new Evaluation.
Create a new Run
To evaluate a version of your Prompt, click on the +Run button, then select the version of the Prompt you want to evaluate and the Dataset you want to use. Click on +Evaluator to add a Human Evaluator to the Evaluation.
You can find example Human Evaluators in the Example Evaluators folder.
Click Save to create a new Run. Humanloop will start generating Logs for the Evaluation.
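If you prefer to trigger Runs from code rather than the UI, the sketch below shows roughly how this could look with the Humanloop Python SDK. The method and parameter names, as well as the Prompt, Dataset, and Evaluator paths, are assumptions for illustration; check the SDK reference for the exact interface available in your SDK version.

```python
# Minimal sketch: triggering an Evaluation Run programmatically.
# Assumes the Humanloop Python SDK (`pip install humanloop`); method and
# parameter names may differ between SDK versions -- see the SDK reference.
from humanloop import Humanloop

hl = Humanloop(api_key="YOUR_API_KEY")  # placeholder API key

# Paths below are hypothetical examples -- substitute your own Prompt,
# Dataset, and Human Evaluator paths.
evaluation = hl.evaluations.run(
    name="SME review",
    file={"path": "My Project/My Prompt"},                # Prompt version to evaluate
    dataset={"path": "My Project/My Dataset"},            # Dataset to generate Logs from
    evaluators=[{"path": "Example Evaluators/Rating"}],   # a Human Evaluator
)
```

Whichever way you create the Run, the resulting Logs appear in the Evaluation, where your SMEs can add their judgments.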
Next steps
- Learn how to manage reviews involving multiple SMEs
- To troubleshoot your Prompts, see our guide on Compare and Debug Prompts.