For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
Sign inBook a demo
DocsReferenceChangelog
DocsReferenceChangelog
  • Getting Started
    • Overview
    • Why Humanloop?
    • Quickstart Tutorial
  • Tutorials
    • Create your first GPT-4 App
    • ChatGPT clone with streaming
  • Guides
    • Create a Prompt
      • Overview
      • Run an experiment
      • Run experiments managing your own model
    • Fine-tune a model
    • Manage API keys
    • Invite collaborators
    • Deploy to environments
  • Core concepts
    • Prompts
    • Tools
    • Datasets
    • Evaluators
    • Logs
    • Environments
    • Key Concepts
  • Reference
    • Supported Models
    • Access Roles
    • .prompt files
    • Postman Workspace
LogoLogo
Sign inBook a demo
On this page
  • Prerequisites
  • Create an experiment
  • Set the experiment live
  • Monitor experiment progress
GuidesExperiments

Run an experiment

This guide shows you how to experiment with Humanloop to systematically find the best-performing model configuration for your project based on your end-user’s feedback.

Was this page helpful?
Previous

Run experiments managing your own model

How to set up an experiment on Humanloop using your own model.
Next
Built with

Experiments can be used to compare different prompt templates, parameter combinations (such as temperature and presence penalties), and even base models.

Prerequisites

  • You already have a Prompt — if not, please follow our Prompt creation guide first.
  • You have integrated humanloop.complete_deployed() or the humanloop.chat_deployed() endpoints, along with the humanloop.feedback() with the API or Python SDK.

This guide assumes you’re using an OpenAI model. If you want to use other providers or your model, refer to the guide for running an experiment with your model provider.

Create an experiment

1

Navigate to the Experiments tab of your Prompt

2

Click the Create new experiment button

  1. Give your experiment a descriptive name.
  2. Select a list of feedback labels to be considered as positive actions - this will be used to calculate the performance of each of your model configs during the experiment.
  3. Select which of your project’s model configs to compare.
  4. Then click the Create button.

Set the experiment live

Now that you have an experiment, you need to set it as the project’s active experiment:

1

Navigate to the Experiments tab.

Of a Prompt go to the Experiments tab.

2

Choose the Experiment card you want to deploy.

3

Click the Deploy button

Next to the Environments label, click the Deploy button.

4

Select the environment to deploy the experiment.

We only have one environment by default so select the ‘production’ environment.

Now that your experiment is active, any SDK or API calls to generate will sample model configs from the list you provided when creating the experiment and any subsequent feedback captured using feedback will contribute to the experiment performance.

Monitor experiment progress

Now that an experiment is live, the data flowing through your generate and feedback calls will update the experiment progress in real-time:

1

Navigate back to the Experiments tab.

2

Select the Experiment card

Here you will see the performance of each model config with a measure of confidence based on how much feedback data has been collected so far:

You can toggle on and off existing model configs and choose to add new model configs from your project over the lifecycle of an experiment

🎉 Your experiment can now give you insight into which of the model configs your users prefer.

How quickly you can draw conclusions depends on how much traffic you have flowing through your project.

Generally, you should be able to draw some initial conclusions after on the order of hundreds of examples.