Agents

Create and evaluate autonomous AI systems.

Agents are currently in beta and subject to change.

Overview

A Humanloop Agent is a multi-step AI system that combines an LLM, external information sources, and tool calling to accomplish complex tasks automatically. At its core is a main orchestrator LLM that calls tools to accomplish its task.

An Agent takes an input and executes multiple steps, using tools to accomplish the task.

An Agent is composed primarily of:

  • A template that instructs the Agent model how to process tasks
  • Model parameters such as temperature, max_tokens, and reasoning_effort
  • The workflow configuration that defines how the Agent processes tasks, such as max_iterations and stopping_tools
  • The available tools the Agent can use, including Tools and Prompts

An Agent executes on the Humanloop platform: for a given Agent version, if you supply the necessary inputs, it will execute until it reaches a stopping condition. Humanloop will return a Log with an output or output_message, along with the full trace of the Agent’s execution.

Inputs are defined through variables in the Agent template. You supply values for these variables when you call the Agent to create a generation. The Agent configuration and runtime data are stored in Logs, which can then be used to create Datasets for evaluation.
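As a rough illustration of how template variables map to inputs, the sketch below finds {{variable}} placeholders in a template string and substitutes supplied values. Humanloop performs this substitution on the platform; the function names template_variables and render are hypothetical and are not part of the SDK.

```python
import re

# Matches {{name}} placeholders, allowing optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def template_variables(template: str) -> list[str]:
    """Return the distinct variable names in a template, in order of first use."""
    seen: list[str] = []
    for name in PLACEHOLDER.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


def render(template: str, inputs: dict[str, str]) -> str:
    """Substitute inputs into the template, failing loudly on missing values."""
    missing = [v for v in template_variables(template) if v not in inputs]
    if missing:
        raise KeyError(f"missing inputs: {missing}")
    # A function replacement avoids backslash-escape surprises in input values.
    return PLACEHOLDER.sub(lambda m: inputs[m.group(1)], template)


template = "In your interactions, follow these guidelines: {{personality}}."
print(template_variables(template))
print(render(template, {"personality": "friendly and concise"}))
```

Checking for missing inputs before sending a request is a cheap way to catch a renamed or forgotten variable early.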

Basics

You can create and configure Agents through the Humanloop UI or programmatically via our SDK.

Agents in the UI

To create an Agent, create a new file and select the Agent option.

Create a new Agent file.

Then, use the Agent Editor to configure and test the Agent. You can link existing Tools and Prompts from your workspace for the Agent to use, or define a tool inline by providing its JSON schema. The max_iterations and stopping_tools settings are configured in the “Parameters” section.

Edit and test the Agent template and configuration in the Editor.

Agents via the SDK

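To create an Agent programmatically, use our SDK or call the API directly. The following request creates an Agent that links an existing file as a tool and defines a stop tool inline.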

curl -X POST https://api.humanloop.com/v5/agents \
  -H "X-API-Key: $YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "path": "Banking/Teller Agent",
    "provider": "anthropic",
    "endpoint": "chat",
    "model": "claude-3-7-sonnet-latest",
    "reasoning_effort": 1024,
    "template": [
      {
        "role": "system",
        "content": "You are a helpful digital bank teller. You help users navigate our digital banking platform."
      }
    ],
    "max_iterations": 3,
    "tools": [
      {
        "type": "file",
        "link": {
          "file_id": "pr_1234567890",
          "version_id": "prv_1234567890"
        },
        "on_agent_call": "continue"
      },
      {
        "type": "json_schema",
        "json_schema": {
          "name": "stop",
          "description": "Call this tool when you have finished your task.",
          "parameters": {
            "type": "object",
            "properties": {
              "output": {
                "type": "string"
              }
            }
          }
        }
      }
    ]
  }'

Agents can be called through our UI for testing and development, or via API for production deployments.

curl -X POST https://api.humanloop.com/v5/agents/call \
  -H "X-API-Key: $YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "path": "Banking/Teller Agent",
    "messages": [
      {
        "role": "user",
        "content": "I need to withdraw $1000"
      }
    ]
  }'

If your Agent calls a tool that Humanloop cannot run, execution pauses until you provide the response the tool requires. You can then resume the Agent’s execution using the /continue endpoint, passing the Log ID and the tool’s result as a tool message.

curl -X POST https://api.humanloop.com/v5/agents/continue \
  -H "X-API-Key: $YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "log_id": "log_1234567890",
    "messages": [
      {
        "role": "tool",
        "tool_call_id": "tool_call_1234567890",
        "content": "Not enough funds. Politely decline user request."
      }
    ],
    "stream": false
  }'
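Hand-written JSON bodies are easy to get subtly wrong (a missing or trailing comma is enough for the request to be rejected). A minimal sketch, assuming the body shape shown in the /continue request above, builds the payload from Python data instead. The helper name build_continue_body is hypothetical, and sending several tool messages in one request is an assumption, not something this page confirms.

```python
import json


def build_continue_body(log_id: str, tool_results: dict[str, str], stream: bool = False) -> str:
    """Build a /continue request body with one tool message per tool call ID."""
    messages = [
        {"role": "tool", "tool_call_id": call_id, "content": content}
        for call_id, content in tool_results.items()
    ]
    # json.dumps guarantees syntactically valid JSON, unlike a hand-edited string.
    return json.dumps({"log_id": log_id, "messages": messages, "stream": stream}, indent=2)


body = build_continue_body(
    "log_1234567890",
    {"tool_call_1234567890": "Not enough funds. Politely decline user request."},
)
print(body)
```

The resulting string can be passed as the request body by whichever HTTP client you already use.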

You can also log Agent executions explicitly when running the Agent logic yourself:

curl -X POST https://api.humanloop.com/v5/agents/log \
  -H "X-API-Key: $YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "path": "Banking/Teller Agent",
    "messages": [
      {
        "role": "user",
        "content": "I need to withdraw $1000"
      }
    ]
  }'

Versioning

Versioning your Agents lets you track how adjustments to the template, tools, or model parameters influence the Agent’s behavior. This is crucial for iterative development, because you can pinpoint which configuration produces the most effective results for your use cases. You can use the Editor or SDK to create and update Agent Versions and iterate on their performance.

The following components of an Agent contribute to its version:

  • The Agent’s template — the system prompt and any other instructions
  • The model call parameters — such as temperature, max_tokens, and reasoning_effort
  • The Agent’s workflow configuration — such as max_iterations and stopping_tools
  • The Agent’s tools — including both inline JSON Schema tools and linked Tool or Prompt Versions
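One way to picture the list above: a version behaves like a fingerprint of the version-relevant fields, so changing any of them yields a new version while unrelated metadata does not. The sketch below is purely illustrative. How Humanloop actually identifies versions is not described here, the field list is an incomplete assumption drawn from the names on this page, and treating path as not version-relevant is also an assumption.

```python
import hashlib
import json

# Illustrative (and incomplete) set of fields that contribute to a version.
VERSION_FIELDS = (
    "template", "temperature", "max_tokens", "reasoning_effort",
    "max_iterations", "stopping_tools", "tools",
)


def version_fingerprint(config: dict) -> str:
    """Hash only version-relevant fields, with keys sorted for a stable result."""
    relevant = {k: config[k] for k in VERSION_FIELDS if k in config}
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


base = {
    "path": "Banking/Teller Agent",
    "template": [{"role": "system", "content": "You are a digital bank teller."}],
    "temperature": 0.7,
    "max_iterations": 3,
    "tools": [],
}
moved = {**base, "path": "Banking/Archive/Teller Agent"}  # metadata change only
hotter = {**base, "temperature": 1.0}                     # model parameter change

print(version_fingerprint(base) == version_fingerprint(moved))   # True
print(version_fingerprint(base) == version_fingerprint(hotter))  # False
```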

Stopping Conditions

An Agent will stop executing in three circumstances:

  1. The maximum number of iterations is reached
  2. The Agent executes a tool that is configured to stop the Agent
  3. The Agent has generated a response with no further tool calls

Max Iterations

You can cap the number of iterations an Agent executes: the Agent will generate up to max_iterations times and then stop.

This results in an Agent Log with a finish_reason of "max_iterations_reached".

Stopping Tools

You can designate that a tool should stop the Agent after it is called. If the tool can be run on Humanloop, it is run and its result is logged before the Agent stops.

This results in an Agent Log with a finish_reason of "stopping_tool_called" and stopping_tool_names populated.

Agent Finished

Lastly, the Agent will stop if it has generated a response with no further tool calls. This represents a natural end to the Agent’s execution, and results in an Agent Log with a finish_reason of "agent_generation_finished".
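The three stopping conditions can be sketched as a simple loop. This is a local simulation, not Humanloop's implementation: the Generation class and run_agent function are hypothetical, tool execution is omitted, and the order in which the conditions are checked within an iteration is an assumption. Only the finish_reason strings and the stopping_tool_names field come from this page.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Generation:
    """One model turn: text plus the names of any tools the model asked to call."""
    content: str
    tool_calls: list[str] = field(default_factory=list)


def run_agent(generate: Callable[[int], Generation],
              max_iterations: int,
              stopping_tools: set[str]) -> dict:
    """Loop until one of the three stopping conditions fires."""
    for iteration in range(max_iterations):
        turn = generate(iteration)
        if not turn.tool_calls:
            # A response with no further tool calls ends the run naturally.
            return {"finish_reason": "agent_generation_finished", "output": turn.content}
        called_stoppers = [name for name in turn.tool_calls if name in stopping_tools]
        if called_stoppers:
            # A tool configured to stop the Agent was called.
            return {"finish_reason": "stopping_tool_called",
                    "stopping_tool_names": called_stoppers}
    # The iteration budget ran out while the model was still calling tools.
    return {"finish_reason": "max_iterations_reached"}


# Scripted model: first looks up a balance, then calls the stop tool.
script = [Generation("", ["lookup_balance"]), Generation("", ["stop"])]
result = run_agent(lambda i: script[i], max_iterations=3, stopping_tools={"stop"})
print(result)
```

Branching on finish_reason in client code lets you distinguish a natural finish from a run that was cut short by the iteration cap.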

Evaluation

Agents can be evaluated both through the Humanloop UI and programmatically. After each Agent execution, configured evaluators will run automatically on Humanloop.

Get started with evaluating your Agents in the UI or programmatically.

Tracing

A trace is the collection of Logs associated with an Agent execution, showing the complete path from input to output, including all intermediate steps.

Every Agent trace includes:

  • The initial input and messages, and final output
  • All intermediate LLM calls and their responses as nested Logs
  • Tools called by the Agent and their inputs and outputs as nested Logs
  • Any errors or unexpected behaviors
  • Timing and performance metrics

Traces show the individual steps taken by the Agent.

Metrics and Timing

Cost, latency, execution time, and token usage metrics are captured for each trace. They span from the earliest interaction to the final response.
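To make "from the earliest interaction to the final response" concrete, the sketch below rolls a list of nested step records up into trace-level figures. The StepLog class and all of its field names are hypothetical stand-ins for illustration; they are not the Log schema returned by Humanloop.

```python
from dataclasses import dataclass


@dataclass
class StepLog:
    """A nested step in a trace; every field name here is hypothetical."""
    start: float  # seconds since the trace began
    end: float
    tokens: int
    cost: float


def summarize(steps: list[StepLog]) -> dict:
    """Roll nested steps up into trace-level metrics."""
    return {
        # Wall-clock span from the earliest start to the latest end.
        "execution_time": max(s.end for s in steps) - min(s.start for s in steps),
        "total_tokens": sum(s.tokens for s in steps),
        "total_cost": sum(s.cost for s in steps),
    }


steps = [
    StepLog(0.0, 1.2, 350, 0.004),  # first LLM call
    StepLog(1.2, 1.5, 0, 0.0),      # tool call: no tokens, no cost
    StepLog(1.5, 3.0, 420, 0.005),  # final LLM call
]
print(summarize(steps))
```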

Serialization

The .agent file format is a serialized representation of an Agent Version, designed to be human-readable and suitable for integration into version control systems alongside code.

The format is heavily inspired by MDX, with model and parameters specified in a YAML header alongside JSX-style syntax for chat templates.

.agent
---
model: claude-3-7-sonnet-latest
max_tokens: -1
provider: anthropic
endpoint: chat
reasoning_effort: 1024
tools: [
  {
    "type": "file",
    "link": {
      "file_id": "pr_1234567890",
      "version_id": "prv_1234567890"
    },
    "on_agent_call": "continue"
  },
  {
    "type": "file",
    "link": {
      "file_id": "tl_1234567890",
      "version_id": "tlv_1234567890"
    },
    "on_agent_call": "stop"
  },
  {
    "type": "inline",
    "json_schema": {
      "name": "stop",
      "description": "Call this tool when you have finished your task.",
      "parameters": {
        "type": "object",
        "properties": {
          "output": {
            "type": "string",
            "description": "The final output to return to the user."
          }
        },
        "additionalProperties": false,
        "required": [
          "output"
        ]
      },
      "strict": true
    },
    "on_agent_call": "stop"
  },
  {
    "type": "file",
    "link": {
      "file_id": "tl_cBfvZ3Xre8PAfVAa5r9P6",
      "version_id": "tlv_gzr3bxaVGu889O4lsOgtG"
    },
    "on_agent_call": "continue"
  }
]
---
<system>
  You are a helpful digital bank teller. You help users navigate our digital banking platform.

  In your interactions, follow these guidelines: {{personality}}.

  Notify them about the following policies: {{policies}}.
</system>
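Because the format is a delimited header followed by a template body, a few lines of code are enough to separate the two, for example in a pre-commit check. The sketch below assumes only what is described above (a header enclosed by --- lines, then the template). The function name split_agent_file is hypothetical, and the header is returned as raw text; parsing it would need a YAML library.

```python
def split_agent_file(text: str) -> tuple[str, str]:
    """Split a .agent file into (raw YAML header, template body)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("expected the file to start with a '---' header delimiter")
    try:
        # Index of the closing delimiter, searching after the opening one.
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        raise ValueError("header is missing its closing '---'") from None
    return "\n".join(lines[1:end]), "\n".join(lines[end + 1:])


sample = """---
model: claude-3-7-sonnet-latest
provider: anthropic
---
<system>
  Notify them about the following policies: {{policies}}.
</system>
"""
header, body = split_agent_file(sample)
print(header)
print(body)
```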

Next steps

You now understand the role of Agents in the Humanloop ecosystem. Explore the following resources to apply Agents to your AI project: