Flows

Instrument and monitor multi-step AI systems.

Introduction

LLM-powered systems are multi-step processes: they draw on information sources, delegate computation to tools, and iterate with the LLM to produce the final answer.

Looking at the inputs and output of such a system is not enough to reason about its behavior. Flows address this by tracing all components of the feature, unifying Logs in a comprehensive view of the system.

Basics

To integrate Flows in your project, add the SDK flow decorator on the entrypoint of your AI feature.

@humanloop.flow(path="QA Agent/Answer Question")
def call_agent(question: str) -> str:
    # A simple question answering agent
    ...
    return answer

The decorator will capture the inputs and output of the agent on Humanloop.
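The examples on this page assume a Humanloop client named humanloop has already been initialized. A minimal sketch using the Python SDK; the API key value is a placeholder:

from humanloop import Humanloop

# Placeholder key: use your own Humanloop API key
humanloop = Humanloop(api_key="YOUR_API_KEY")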

You can then start evaluating the system’s performance through code or through the platform UI.

Tracing

Additional Logs can be added to the trace to provide further insight into the system’s behavior.

On Humanloop, a trace is the collection of Logs associated with a Flow Log.

Question answering agent
@humanloop.tool(path="QA Agent/Search Wikipedia")
def search_wikipedia(query: str) -> dict:
    """The LLM calls this function to search Wikipedia."""
    ...


@humanloop.prompt(path="QA Agent/Call Model")
def call_model(messages: list[dict]) -> dict:
    """Interact with the LLM model."""
    ...


@humanloop.flow(
    path="QA Agent/Answer Question",
    attributes={"version": "v1", "wikipedia": True}
)
def call_agent(question: str) -> str:
    """A simple question answering agent."""
    ...

The agent makes multiple provider calls to refine its response to the question, and calls search_wikipedia via function calling to retrieve additional information from an external source.

Calling the other functions inside call_agent creates Logs and adds them to the trace created by call_agent.

The trace shows the individual steps taken by the agent.
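For illustration, the body of call_agent could look something like the sketch below. The loop and the message and tool-call shapes are assumptions (an OpenAI-style chat format), not part of the Humanloop API; the point is that every call to call_model and search_wikipedia produces a Log inside the Flow's trace.

import json


@humanloop.flow(
    path="QA Agent/Answer Question",
    attributes={"version": "v1", "wikipedia": True},
)
def call_agent(question: str) -> str:
    """A simple question answering agent."""
    messages = [{"role": "user", "content": question}]
    while True:
        # Each call creates a Prompt Log inside the Flow's trace
        response = call_model(messages)
        messages.append(response)
        tool_calls = response.get("tool_calls") or []
        if not tool_calls:
            # No tool calls left: the model's reply is the final answer
            return response["content"]
        for tool_call in tool_calls:
            # Each call creates a Tool Log inside the same trace
            arguments = json.loads(tool_call["function"]["arguments"])
            result = search_wikipedia(arguments["query"])
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call["id"],
                "content": json.dumps(result),
            })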

Manual Tracing

If you don’t want to use decorators, first create a Flow Log, then pass its id when creating Logs you want to add to the trace.

Tracing via API
def call_agent(question: str) -> str:
    trace_id = humanloop.flows.log(
        name="QA Agent/Answer Question",
        flow={
            "attributes": {
                "version": "v1",
                "wikipedia": True
            }
        },
        inputs={"question": question}
    ).id

    llm_output = humanloop.prompts.call(
        name="QA Agent/Answer",
        prompt={...},
        messages=[...],
        parent_trace_id=trace_id
    )

    ...

    humanloop.flows.update_log(
        log_id=trace_id,
        output=answer,
        log_status="complete"
    )

    return answer

Versioning

Any data you pass into attributes will contribute to the version of the Flow. If you pass in a new value, the version will be updated.

Question answering agent
@humanloop.flow(
    path="QA Agent/Answer Question",
    attributes={"version": "v1", "wikipedia": True}
)
def call_agent(question: str) -> str:
    """A simple question answering agent."""
    ...
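For example, changing one of the attribute values (a hypothetical bump of "version" to "v2") registers a new version of the Flow the next time the function logs:

@humanloop.flow(
    path="QA Agent/Answer Question",
    # Changed attributes: Humanloop records a new Flow version on the next Log
    attributes={"version": "v2", "wikipedia": True}
)
def call_agent(question: str) -> str:
    """A simple question answering agent."""
    ...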

Completing Flow Logs

Traces must be marked as complete once all relevant Logs have been added. The flow decorator will mark a trace as complete when the function returns.

Monitoring Evaluators only run on Flow Logs once the log is completed. Completing the Flow Log signals its Evaluators that no other Logs will arrive.

Unlike other Logs, Evaluators added to Flows can access all Logs inside a trace:

Monitoring Evaluator on Humanloop
def count_logs_evaluator(log):
    """Count the number of Logs in a trace."""
    if log["children"]:
        # Use the `children` attribute to access all Logs in the trace
        return 1 + sum([count_logs_evaluator(child) for child in log["children"]])
    return 1

A Flow Log’s metrics, such as cost, latency and tokens, are computed once the Log is completed.

A Flow Log’s start_time and end_time are computed automatically to span the earliest start and latest end of the Logs in its trace. If start_time and end_time already span the Logs’ timestamps, they are kept.

Manual Tracing

If you don’t want to use the decorator, you can complete the Flow Log via the SDK directly.

humanloop.flows.update_log(
    log_id=trace_id,
    log_status="complete"
)

Evaluation

Unlike Prompts, which can be evaluated through the Humanloop UI, Flows must be evaluated through code. Evaluators added to the evaluation still run on Humanloop once each Flow Log is completed.

To do this, provide a callable argument to the evaluations.run SDK method.

Evaluating a Flow
humanloop.evaluations.run(
    name="Comprehensiveness Evaluation",
    file={
        "path": "QA Agent/Answer Question",
        "callable": call_agent,
    },
    evaluators=[
        {"path": "QA Agent/Answer Comprehensiveness"},
    ],
    dataset={"path": "QA Agent/Simple Answers"},
)

Next steps

You now understand the role of Flows in the Humanloop ecosystem. Explore the following resources to apply Flows to your AI project:

  • Check out our logging quickstart for an example project instrumented with Flows.

  • Dive into the evals guide to learn how to evaluate your AI project.