Flows

Instrument and monitor multi-step AI systems.

Introduction

LLM-powered systems are multi-step processes: they draw on information sources, delegate computation to tools, or iterate with the LLM to generate the final answer.

Looking only at the inputs and output of such a system is not enough to reason about its behavior. Flows address this by tracing all components of the feature, unifying their Logs in a comprehensive view of the system.

Basics

To integrate Flows in your project, add the SDK flow decorator on the entrypoint of your AI feature.

@humanloop.flow(path="QA Agent/Answer Question")
def call_agent(question: str) -> str:
    # A simple question answering agent

    ...

    return answer

The decorator will capture the inputs and output of the agent on Humanloop.

You can then start evaluating the system’s performance through code or through the platform UI.

Tracing

Additional Logs can be added to the trace to provide further insight into the system’s behavior.

On Humanloop, a trace is the collection of Logs associated with a Flow Log.

Question answering agent
@humanloop.tool(path="QA Agent/Search Wikipedia")
def search_wikipedia(query: str) -> dict:
    """LLM function calls this to search Wikipedia."""
    ...


@humanloop.prompt(path="QA Agent/Call Model")
def call_model(messages: list[dict]) -> dict:
    """Interact with the LLM model."""
    ...


@humanloop.flow(
    path="QA Agent/Answer Question",
    attributes={"version": "v1", "wikipedia": True}
)
def call_agent(question: str) -> str:
    """A simple question answering agent."""
    ...

The agent makes multiple provider calls to refine its answer to the question, and issues function calls to search_wikipedia to retrieve additional information from an external source.

Calling the other functions inside call_agent creates Logs and adds them to the trace created by call_agent.

The trace shows the individual steps taken by the agent.
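Conceptually, the decorators maintain a trace context: when a decorated function runs inside call_agent, its Log attaches to the enclosing Flow Log's trace. A toy sketch of this mechanism in plain Python (the traced decorator and its dict-based Logs are illustrative, not the SDK's actual implementation):

```python
import contextvars

# Toy trace context -- not the Humanloop SDK's implementation.
_current_log = contextvars.ContextVar("current_log", default=None)
traces = []

def traced(path):
    """Hypothetical decorator: create a Log and attach it to the enclosing trace."""
    def wrap(fn):
        def inner(*args, **kwargs):
            log = {"path": path, "children": []}
            parent = _current_log.get()
            if parent is not None:
                parent["children"].append(log)  # nested call: join the parent's trace
            else:
                traces.append(log)              # outermost call: start a new trace
            token = _current_log.set(log)
            try:
                return fn(*args, **kwargs)
            finally:
                _current_log.reset(token)
        return inner
    return wrap

@traced("QA Agent/Call Model")
def call_model(messages):
    return "Paris"

@traced("QA Agent/Answer Question")
def call_agent(question):
    return call_model([{"role": "user", "content": question}])

call_agent("What is the capital of France?")
print(traces[0]["path"])                           # QA Agent/Answer Question
print([c["path"] for c in traces[0]["children"]])  # ['QA Agent/Call Model']
```

Because the context variable is restored after each call, sibling calls at the same level attach to the same parent rather than to each other.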

Manual Tracing

If you don’t want to use decorators, first create a Flow Log, then pass its id when creating Logs you want to add to the trace.

Tracing via API
def call_agent(question: str) -> str:
    trace_id = humanloop.flows.log(
        path="QA Agent/Answer Question",
        flow={
            "attributes": {
                "version": "v1",
                "wikipedia": True
            }
        },
        inputs={"question": question}
    ).id

    llm_output = humanloop.prompts.call(
        path="QA Agent/Answer",
        prompt={...},
        messages=[...],
        trace_parent_id=trace_id
    )

    ...

    humanloop.flows.update_log(
        log_id=trace_id,
        output=answer,
        log_status="complete"
    )

Versioning

Any data you pass into attributes will contribute to the version of the Flow. If you pass in a new value, the version will be updated.

Question answering agent
@humanloop.flow(
    path="QA Agent/Answer Question",
    attributes={"version": "v1", "wikipedia": True}
)
def call_agent(question: str) -> str:
    """A simple question answering agent."""
    ...

Completing Flow Logs

Flow Logs can be marked as complete in order to prevent further Logs from being added to the trace. The flow decorator will mark a trace as complete when the function returns.

Monitoring Evaluator on Humanloop
def count_logs_evaluator(log):
    """Count the number of Logs in a trace."""
    if log["children"]:
        # Use the `children` attribute to access all Logs in the trace
        return 1 + sum(count_logs_evaluator(child) for child in log["children"])
    return 1
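For illustration, here is the evaluator applied to a hypothetical trace (the nested children shape is assumed from the attribute used above):

```python
def count_logs_evaluator(log):
    """Count the number of Logs in a trace."""
    if log["children"]:
        return 1 + sum(count_logs_evaluator(child) for child in log["children"])
    return 1

# Hypothetical Flow Log: two child Logs, one of which nests a further Log.
flow_log = {
    "children": [
        {"children": []},
        {"children": [{"children": []}]},
    ]
}
print(count_logs_evaluator(flow_log))  # 4
```

The Flow Log itself counts as one, plus its two children and the one grandchild, giving four Logs in total.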

A Flow Log’s metrics, such as cost, latency and tokens, are computed as Logs are added to the trace.

A Flow Log’s start_time and end_time are computed automatically to span the earliest start and latest end of the Logs in its trace. If start_time and end_time already span the Logs’ timestamps, they are left unchanged.
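The widening rule can be sketched in isolation (compute_span and the timestamps below are illustrative, not the platform's code):

```python
from datetime import datetime

def compute_span(flow_start, flow_end, child_spans):
    """Widen (flow_start, flow_end) so it covers every child Log's span.
    If the Flow Log already spans all children, it is returned unchanged."""
    starts = [start for start, _ in child_spans] + [flow_start]
    ends = [end for _, end in child_spans] + [flow_end]
    return min(starts), max(ends)

flow_start = datetime(2024, 1, 1, 12, 0, 5)
flow_end = datetime(2024, 1, 1, 12, 0, 10)
children = [
    (datetime(2024, 1, 1, 12, 0, 3), datetime(2024, 1, 1, 12, 0, 7)),   # starts earlier
    (datetime(2024, 1, 1, 12, 0, 6), datetime(2024, 1, 1, 12, 0, 12)),  # ends later
]
start, end = compute_span(flow_start, flow_end, children)
print(start, end)  # 2024-01-01 12:00:03 2024-01-01 12:00:12
```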

If you don’t want to use the decorator, you can complete the Flow Log via the SDK directly.

humanloop.flows.update_log(
    log_id=trace_id,
    log_status="complete"
)

Evaluation

Unlike Prompts, which can be evaluated through the Humanloop UI, Flows must be evaluated through code.

To do this, provide a callable argument to the evaluations.run SDK method.

Unlike other Logs, Evaluators added to Flows can access all Logs inside a trace:

Evaluating a Flow
humanloop.evaluations.run(
    name="Comprehensiveness Evaluation",
    file={
        "path": "QA Agent/Answer Question",
        "callable": call_agent,
    },
    evaluators=[
        {"path": "QA Agent/Answer Comprehensiveness"},
    ],
    dataset={"path": "QA Agent/Simple Answers"},
)

Every time a Log is added to a Flow trace, monitoring Evaluators and Evaluations re-evaluate the Log. This behaviour is throttled, so adding multiple Logs in quick succession will result in a single re-evaluation.
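The throttling resembles a leading-edge debounce: the first Log triggers a re-evaluation, and Logs arriving within the throttle window do not trigger another. An illustrative sketch (ThrottledEvaluator and its interval are assumptions, not Humanloop's implementation):

```python
import time

class ThrottledEvaluator:
    """Illustrative throttle: rapid Log additions collapse into one re-evaluation."""

    def __init__(self, interval_seconds=1.0):
        self.interval = interval_seconds
        self.last_run = None
        self.evaluations = 0

    def on_log_added(self):
        now = time.monotonic()
        if self.last_run is None or now - self.last_run >= self.interval:
            self.last_run = now
            self.evaluations += 1  # the re-evaluation would run here

evaluator = ThrottledEvaluator(interval_seconds=1.0)
for _ in range(5):  # five Logs added in quick succession
    evaluator.on_log_added()
print(evaluator.evaluations)  # 1
```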

Next steps

You now understand the role of Flows in the Humanloop ecosystem. Explore the following resources to apply Flows to your AI project: