Overview
How to develop and manage your Prompt and Tools on Humanloop
Your AI application can be broken down into Prompts, Tools, and Evaluators. Humanloop versions and manages each of these artifacts to enable team collaboration and evaluation of each component of your AI system.
This overview will explain the basics of prompt development, versioning, and management, and how to best integrate your LLM calls with Humanloop.
Prompt Management
Prompts are a fundamental part of interacting with large language models (LLMs). They define the instructions and parameters that guide the model’s responses. In Humanloop, Prompts are managed with version control, allowing you to track changes and improvements over time.
A Prompt on Humanloop encapsulates the instructions and other configuration for how a large language model should perform a specific task. Each change in any of the following properties creates a new version of the Prompt:
- the template such as
Write a song about {{topic}}
. For chat models, your template will contain an array of messages. - the model e.g.
gpt-4o
- all the parameters to the model such as
temperature
,max_tokens
,top_p
etc. - any tools available to the model
Creating a Prompt
You can create a Prompt explicitly in the Prompt Editor or via the API.
New prompts can also be created automatically via the API if you specify the Prompt’s path
(its name and directory) while supplying the Prompt’s parameters and template. This is useful if you are developing your prompts in code and want to be able to version them as you make changes to the code.
Versioning
A Prompt will have multiple versions as you experiment with different models, parameters, or templates. However, all versions should perform the same task and generally be interchangeable with one another.
By versioning your Prompts, you can track how adjustments to the template or parameters influence the LLM’s responses. This is crucial for iterative development, as you can pinpoint which versions produce the most relevant or accurate outputs for your specific use case.
As you edit your prompt, new versions of the Prompt are created automatically. Each version is timestamped and given a unique version ID which is deterministically based on the Prompt’s contents. For every version that you want to “save”, you commit that version and it will be recorded as a new committed version of the Prompt with a commit message.
When to create a new Prompt
You should create a new Prompt for every different ‘task to be done’ with the LLM. For example each of these tasks are things that can be done by an LLM and should be a separate Prompt File: Writing Copilot, Personal Assistant, Summariser, etc.
We’ve seen people find it useful to also create a Prompt called ‘Playground’ where they can free form experiment without concern of breaking anything or making a mess of their other Prompts.
Prompt Engineering
Understanding the best practices for working with large language models can significantly enhance your application’s performance. Each model has its own failure modes, and the methods to address or mitigate these issues are not always straightforward. The field of “prompt engineering” has evolved beyond just crafting prompts to encompass designing systems that incorporate model queries as integral components.
For a start, read our Prompt Engineering 101 guide which covers techniques to improve model reasoning, reduce the chances of model hallucinations, and more.
Prompt templates
Inputs are defined in the template through the double-curly bracket syntax e.g. {{topic}}
and the value of the variable will need to be supplied when you call the Prompt to create a generation.
This separation of concerns, keeping configuration separate from the query time data, is crucial for enabling you to experiment with different configurations and evaluate any changes. The Prompt stores the configuration and the query time data in Logs, which can then be used to create Datasets for evaluation purposes.
Tool Use (Function Calling)
Certain large language models support tool use or “function calling”. For these models, you can supply the description of functions and the model can choose to call one or more of them by providing the values to call the functions with.
Function calling enables the model to perform various tasks:
1. Call external APIs: The model can translate natural language into API calls, allowing it to interact with external services and retrieve information.
2. Take actions: The model can exhibit agentic behavior, making decisions and taking actions based on the given context.
3. Provide structured output: The model’s responses can be constrained to a specific structured format, ensuring consistency and ease of parsing in downstream applications.
Tools for function calling can be defined inline in the Prompt editor in which case they form part of the Prompt version. Alternatively, they can be pulled out in a Tool file which is then referenced in the Prompt.
Each Tool has functional interface that can be supplied as the JSON Schema needed for function calling. Additionally, if the Tool is executable on Humanloop, the result of any tool will automatically be inserted into the response in the API and in the Editor.
Using Prompts
Prompts are callable as an API. You supply and query-time data such as input values or user messages, and the model will respond with its text output.
A Prompt is callable in that if you supply the necessary inputs, it will return a response from the model.
Once you have created and versioned your Prompt, you can call it as an API to generate responses from the large language model directly. You can also fetch the log the data from your LLM calls, enabling you to evaluate and improve your models.
Proxying your LLM calls vs async logging
The easiest way to both call the large language model with your Prompt and to log the data is to use the Prompt.call()
method (see the guide on Calling a Prompt) which will do both in a single API request. However, there are two main reasons why you may wish to log the data seperately from generation:
- You are using your own model that is not natively supported in the Humanloop runtime.
- You wish to avoid relying on Humanloop runtime as the proxied calls adds a small additional latency, or
The prompt.call()
Api encapsulates the LLM provider calls (for example openai.Completions.create()
), the model-config selection and logging steps in a single unified interface. There may be scenarios that you wish to manage the LLM provider calls directly in your own code instead of relying on Humanloop.
Humanloop provides a comprehensive platform for developing, managing, and versioning Prompts, Tools and your other artifacts of you AI systems. This explainer will show you how to create, version and manage your Prompts, Tools and other artifacts.
You can also use Prompts without proxying through Humanloop to the model provider and instead call the model yourself and explicitly log the results to your Prompt.
Serialization (.prompt
file)
Our .prompt
file format is a serialized version of a model config that is designed to be human-readable and suitable for checking into your version control systems alongside your code. See the .prompt files reference reference for more details.
Format
The .prompt file is heavily inspired by MDX, with model and hyperparameters specified in a YAML header alongside a JSX-inspired format for your Chat Template.
Basic examples
Dealing with sensitive data
When working with sensitive data in your AI applications, it’s crucial to handle it securely. Humanloop provides options to help you manage sensitive information while still benefiting from our platform’s features.
If you need to process sensitive data without storing it in Humanloop, you can use the save: false
parameter when making calls to the API or logging data. This ensures that only metadata about the request is stored, while the actual sensitive content is not persisted in our systems.
For PII detection, you can set up Guardrails to detect and prevent the generation of sensitive information.