
Compare and Debug Prompts

In this guide, we will walk through comparing the outputs of multiple Prompt versions side by side in the Humanloop Editor, and using diffs to help with debugging.


You can compare Prompt versions interactively, side by side, to get a sense of how their behaviour differs before triggering more systematic Evaluations. All interactions in the Editor are stored as Logs within your Prompt, where they can be inspected further and added to a Dataset for Evaluations.

Prerequisites

  • You already have a Prompt. If not, please follow our Prompt creation guide first.

Compare Prompt versions

In this example we will use a simple support agent Prompt that answers user queries about Humanloop’s product and docs.

Support agent base prompt.

1. Create a new version of your Prompt

Open your Prompt in the Editor. Under Parameters, change some details such as the choice of Model. In this example, we change from gpt-4o to gpt-4o-mini.

Support agent change model

Now save the new version of your Prompt by selecting the Save button in the top right. You can optionally provide a helpful version name (e.g. “Simple Support Agent v2”) and/or a description (e.g. “Changed model to gpt-4o-mini”).

2. Load up two versions of your Prompt in the Editor

To load the previous version side by side, open the menu beside the Load button and choose the New panel option. Depending on your available screen space, you can add more than two panels.

Support agent add panel

Then press the Load button in the new panel and select another version of your Prompt to compare.

Support agent load version

3. Compare the outputs of both versions

Now you can run the same user messages through both versions to compare their behaviour live, side by side.

Support agent compare version
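
The Editor does this comparison interactively, but the same idea can be sketched in code. The example below is a minimal illustration, not the Humanloop SDK: `compare_side_by_side`, `version_a` and `version_b` are hypothetical names, and the two version functions are stand-ins that you would replace with real calls to each Prompt version.

```python
from typing import Callable, Dict, List


def compare_side_by_side(
    messages: List[str],
    versions: Dict[str, Callable[[str], str]],
) -> List[Dict[str, str]]:
    """Send every message to every version and collect one row per message."""
    rows = []
    for message in messages:
        row = {"message": message}
        for name, generate in versions.items():
            row[name] = generate(message)
        rows.append(row)
    return rows


# Hypothetical stand-ins for two Prompt versions; they only echo the input.
def version_a(message: str) -> str:
    return f"[gpt-4o] reply to: {message}"


def version_b(message: str) -> str:
    return f"[gpt-4o-mini] reply to: {message}"


rows = compare_side_by_side(
    ["How do I create a Dataset?"],
    {"v1": version_a, "v2": version_b},
)

# Print each message with the answers from both versions underneath it.
for row in rows:
    for key, value in row.items():
        print(f"{key:>8}: {value}")
```

Keeping the inputs identical across versions is the point of the comparison: any difference in the rows then comes from the version, not from the message.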

View Prompt diff for debugging

When debugging more complex Prompts, it’s important to understand what changes were made between different versions. Humanloop provides a diff view to support this.

1. Navigate to your Prompt dashboard

In the sidebar, select the Dashboard section under your Prompt file, where you will find a table of all your Prompt versions.

Support agent dashboard

2. Select the versions to compare

In the table, select two rows you would like to see the changes between. Then select the Show diff button above the table.

Support agent diff view
  1. In the comparison view, you’ll see a diff that highlights the changes between the selected versions.
  2. The diff shows additions in green and deletions in red. Modified content appears as a combination of red (for removed text) and green (for added text).
  3. Use this view to understand how specific changes affect the output.
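
To make the idea of a version diff concrete, here is a minimal sketch using Python's standard `difflib` module. It does not reproduce Humanloop's diff view. The dictionaries `v1` and `v2` are hypothetical representations of two Prompt versions: only the model names come from the example above, while the `temperature` and `template` values are assumed for illustration.

```python
import difflib
import json
from typing import List


def diff_versions(old: dict, new: dict) -> List[str]:
    """Return a unified diff between two version configs, one string per line."""
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="v1", tofile="v2", lineterm=""
        )
    )


# Hypothetical version configs; only the model differs between them.
v1 = {
    "model": "gpt-4o",
    "temperature": 0.7,
    "template": "You are a helpful support agent.",
}
v2 = {
    "model": "gpt-4o-mini",
    "temperature": 0.7,
    "template": "You are a helpful support agent.",
}

# Lines starting with "-" were removed and lines starting with "+" were added,
# analogous to the red and green highlighting in a visual diff.
for line in diff_versions(v1, v2):
    print(line)
```

Serializing with `sort_keys=True` keeps the key order stable, so the diff reports only real changes rather than reordered fields.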

By following these steps, you can effectively compare different versions of your Prompts and iterate on your instructions to improve performance.