Call a Prompt.
Calling a Prompt calls the model provider before logging the request, responses and metadata to Humanloop.
You can use the query parameters version_id or environment to target an existing version of the Prompt. Otherwise, the default deployed version will be chosen.
Instead of targeting an existing version explicitly, you can pass Prompt details in the request body. In this case, we will check whether the details correspond to an existing version of the Prompt and, if they do not, create a new version. This is helpful when you store or derive your Prompt details in code.
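As a minimal sketch of the version targeting described above, the helper below appends version_id or environment as query parameters. The base URL and the function name build_call_url are placeholders invented for illustration; only the two query parameter names come from this page.

```python
from urllib.parse import urlencode

# Placeholder URL for illustration only; substitute the real endpoint
# from your API reference.
BASE_URL = "https://api.example.com/prompts/call"


def build_call_url(version_id=None, environment=None):
    """Return the call URL, adding whichever version selectors were given.

    With no selector, the bare URL is returned, which corresponds to
    letting the default deployed version be chosen.
    """
    params = {}
    if version_id is not None:
        params["version_id"] = version_id
    if environment is not None:
        params["environment"] = environment
    return f"{BASE_URL}?{urlencode(params)}" if params else BASE_URL


if __name__ == "__main__":
    print(build_call_url())
    print(build_call_url(environment="production"))
```

urlencode handles percent-escaping, so IDs or environment names containing reserved characters remain safe to pass.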
A specific Version ID of the Prompt to log to.
Name of the Environment identifying a deployed version to log to.
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
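For readers consuming the streamed response, the following sketch shows one way to extract payloads from data-only server-sent events. It assumes the response body has already been split into text lines and makes no assumption about the payload format. The function name iter_sse_data is hypothetical.

```python
def iter_sse_data(lines):
    """Yield the data payload of each server-sent event.

    `lines` is any iterable of text lines. Events are separated by blank
    lines, and multiple `data:` lines in one event are joined with newlines.
    Lines that do not start with `data:` (comments, other fields) are ignored.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
    # Lenient choice in this sketch: flush a trailing event that was not
    # followed by a blank line.
    if buffer:
        yield "\n".join(buffer)
```

Each yielded string can then be decoded in whatever format the stream uses.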
Path of the Prompt, including the name. This locates the Prompt in the Humanloop filesystem and is used as a unique identifier. Example: folder/name or just name.
ID for an existing Prompt.
The messages passed to the provider chat endpoint.
Controls how the model uses tools. The following options are supported:
'none' means the model will not call any tool and instead generates a message; this is the default when no tools are provided as part of the Prompt.
'auto' means the model can decide to call one or more of the provided tools; this is the default when tools are provided as part of the Prompt.
'required' means the model must call one or more of the provided tools.
{'type': 'function', 'function': {'name': <TOOL_NAME>}} forces the model to use the named function.
Details of your Prompt. A new Prompt version will be created if the provided details are new.
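The tool-choice options listed above can be checked client-side before sending a request. The sketch below is illustrative only: is_valid_tool_choice and force_tool are hypothetical helper names, and the server remains the authority on what it accepts.

```python
def force_tool(name):
    """Build the dictionary form that forces the model to use one named function."""
    return {"type": "function", "function": {"name": name}}


def is_valid_tool_choice(choice):
    """Return True if `choice` matches one of the documented shapes."""
    if isinstance(choice, str):
        return choice in {"none", "auto", "required"}
    if isinstance(choice, dict):
        function = choice.get("function")
        return (
            choice.get("type") == "function"
            and isinstance(function, dict)
            and isinstance(function.get("name"), str)
            and function["name"] != ""
        )
    return False


if __name__ == "__main__":
    # "get_weather" is a made-up tool name used only for this example.
    print(is_valid_tool_choice(force_tool("get_weather")))
```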
The inputs passed to the prompt template.
Identifies where the model was called from.
Any additional metadata to record.
When the logged event started.
When the logged event ended.
Unique identifier for the Datapoint that this Log is derived from. This can be used by Humanloop to associate Logs to Evaluations. If provided, Humanloop will automatically associate this Log to Evaluations that require a Log for this Datapoint-Version pair.
Identifier of the Flow Log to which the Log will be associated. Multiple Logs can be associated by passing the same trace_id in subsequent log requests. Use the Flow File log endpoint to create the Trace first.
Log under which this Log should be nested. Leave this field blank if the Log should be nested directly under the root Trace Log. The parent Log should already be added to the Trace.
Array of Batch IDs that this Log is part of. Batches are used to group Logs together for offline Evaluations.
End-user ID related to the Log.
The name of the Environment the Log is associated to.
Whether the request/response payloads will be stored on Humanloop.
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
The number of generations.
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
Include the log probabilities of the top n tokens in the provider_response.
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
The index of the sample in the batch.
ID of the log.
ID of the Prompt the log belongs to.
ID of the specific version of the Prompt.
Generated output from your model for the provided inputs. Can be None if logging an error, or if creating a parent Log with the intention to populate it later.
User defined timestamp for when the log was created.
Error message if the log is an error.
Duration of the logged event in seconds.
Captured log and debug statements.
The message returned by the provider.
Number of tokens in the prompt used to generate the output.
Number of tokens in the output generated by the model.
Cost in dollars associated to the tokens in the prompt.
Cost in dollars associated to the tokens in the output.
Reason the generation finished.