
Use LLMs to evaluate logs

In this guide, we will set up an LLM evaluator to check for PII (Personally Identifiable Information) in Logs.

As well as using Python code to evaluate Logs, you can create special-purpose prompts that have an LLM evaluate Logs.

In this guide, we’ll show how to set up LLM evaluations.

Prerequisites

  • You need to have access to evaluations.
  • You also need to have a Prompt – if not, please follow our Prompt creation guide.
  • Finally, you need at least a few logs in your project. Use the Editor to generate some logs if you don’t have any yet.

Set up an LLM evaluator

1

From the Evaluations page, click New Evaluator and select AI.

2

From the presets menu on the left-hand side of the page, select PII.

3

Set the evaluator to Online mode, and toggle Auto-run to on. This will make the PII checker run on all new logs in the project.

The PII check evaluator.
4

Click Create in the bottom left of the page.

5

Go to Editor and try generating a couple of logs, some containing PII and some without.

6

Go to the Logs table to review these logs.

The logs table, showing that the PII check evaluator ran on the latest logs.
7

Click one of the logs to see more details in the drawer.

In our example, you can see that the log did contain PII, and the PII check evaluator has correctly identified this and flagged it with False.

8

Click View session at the top of the log drawer to inspect the LLM evaluator’s generation in more detail.

9

Select the PII check entry in the session trace.

In the Completed Prompt tab of the log, you’ll see the full input and output of the LLM evaluator generation.

The LLM evaluator produced an explanation of why the underlying log contained PII, and ended with a final verdict of 'False'.
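If you want to post-process evaluator generations yourself, the pattern of "explanation first, verdict last" can be handled with a small parser. The following is a minimal sketch, not Humanloop's implementation: the function name `extract_verdict` and the sample generation text are hypothetical, and it assumes the verdict is the last standalone True or False in the text.

```python
import re
from typing import Optional


def extract_verdict(generation: str) -> Optional[bool]:
    """Return the last standalone True/False found in the text, or None.

    Assumption: the evaluator writes its reasoning first and ends with
    the verdict, so the final match is treated as the verdict.
    """
    matches = re.findall(r"\b(true|false)\b", generation, flags=re.IGNORECASE)
    if not matches:
        return None
    return matches[-1].lower() == "true"


# Hypothetical generation text, for illustration only.
generation = (
    "The log includes a full name and an email address, so it contains PII.\n"
    "False"
)
print(extract_verdict(generation))  # False
```

Returning None when no verdict is found lets the caller tell a missing verdict apart from a failed check.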

Available variables

In the prompt editor for an LLM evaluator, you have access to the underlying log you are evaluating and, for offline evaluations, the testcase that gave rise to it. Access them with the standard {{ variable }} syntax, using dot notation to pick out specific values from inside the log and testcase objects. The log and testcase shown in the debug console correspond to the objects available in the context of the LLM evaluator prompt.

For example, suppose you are evaluating a log object like this.

JSON
{
  "id": "data_B3RmIu9aA5FibdtXP7CkO",
  "model_config": {...},
  "inputs": {
    "hello": "world"
  },
  "messages": [],
  "output": "This is what the AI responded with.",
  ...etc
}

In the LLM evaluator prompt, if you write {{ log.inputs.hello }} it will be replaced with world in the final prompt sent to the LLM evaluator model.
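To make the dot-notation lookup concrete, here is a minimal sketch of how such a substitution could work. It is an illustration only and does not reflect how Humanloop renders prompts internally: the helper names `resolve` and `render` are hypothetical, and the `log` dictionary reuses the example fields above with the elided parts left out.

```python
import re
from typing import Any

# Matches {{ name }} or {{ dotted.name }}, with optional inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def resolve(path: str, context: dict[str, Any]) -> Any:
    """Walk a dotted path such as 'log.inputs.hello' through nested dicts."""
    value: Any = context
    for key in path.split("."):
        value = value[key]  # raises KeyError if the path does not exist
    return value


def render(template: str, context: dict[str, Any]) -> str:
    """Replace each {{ dotted.path }} placeholder with its value from context."""
    return _PLACEHOLDER.sub(lambda m: str(resolve(m.group(1), context)), template)


# Example log object, limited to the fields shown in this guide.
log = {
    "id": "data_B3RmIu9aA5FibdtXP7CkO",
    "inputs": {"hello": "world"},
    "messages": [],
    "output": "This is what the AI responded with.",
}

prompt = render("The input was: {{ log.inputs.hello }}", {"log": log})
print(prompt)  # The input was: world
```

In this sketch a missing key raises a KeyError, so a mistyped variable name fails loudly instead of leaving an unfilled placeholder in the prompt.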

Note that in order to get access to the fully populated prompt that was sent in the underlying log, you can use {{ log_prompt }}.