Evaluations

Create

POST
Create an evaluation.

Path parameters

project_id
string, Required

String ID of the project. Starts with pr_.

Request

This endpoint expects an object.
config_id
string, Required

ID of the config to evaluate. Starts with config_.

evaluator_ids
list of strings, Required

IDs of evaluators to run on the dataset. IDs start with evfn_.

dataset_id
string, Required

ID of the dataset to use in this evaluation. Starts with evts_.

provider_api_keys
object, Optional

API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization. Ensure you provide an API key for the provider of the model config you are evaluating, or have one saved to your organization.
max_concurrency
integer, Optional, defaults to 5

The maximum number of concurrent generations to run. A higher value completes the evaluation faster but is more likely to hit your provider's rate limits.
hl_generated
boolean, Optional

Whether the log generations for this evaluation should be performed by Humanloop. If false, you must submit the generation logs yourself via the API.
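
For example, a create request might look like the following sketch in Python using the requests library. The base URL, the /projects/{project_id}/evaluations path, and the X-API-KEY header are assumptions inferred from the project_id path parameter above, and all IDs are placeholders; check your API reference for the exact route and authentication scheme.

```python
import os

import requests

# Assumed base URL and route; verify against your API reference.
PROJECT_ID = "pr_1234567890"  # placeholder project ID
url = f"https://api.humanloop.com/v4/projects/{PROJECT_ID}/evaluations"

payload = {
    "config_id": "config_abc123",          # config to evaluate (placeholder)
    "evaluator_ids": ["evfn_xyz789"],      # evaluators to run on the dataset (placeholder)
    "dataset_id": "evts_def456",           # dataset to evaluate against (placeholder)
    "provider_api_keys": {                 # not stored by Humanloop
        "openai": os.environ["OPENAI_API_KEY"],
    },
    "max_concurrency": 5,                  # default; raise with care for rate limits
    "hl_generated": True,                  # let Humanloop perform the generations
}

response = requests.post(
    url,
    json=payload,
    headers={"X-API-KEY": os.environ["HUMANLOOP_API_KEY"]},  # assumed auth header
)
response.raise_for_status()
evaluation = response.json()
print(evaluation["id"], evaluation["status"])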

Response

This endpoint returns an object.

id
string

Unique ID for the evaluation. Starts with ev_.

status
enum
Status of an evaluation.
Allowed values: pending, running, completed, failed, cancelled
config
union
created_at
datetime
updated_at
datetime
evaluators
list of objects
dataset
object
dataset_version_id
string
dataset_snapshot
object, Optional
evaluator_aggregates
list of objects, Optional
feedback_aggregates
list of unions, Optional
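
Once created, an evaluation typically moves from pending through running to one of the terminal statuses. A minimal polling sketch, assuming a companion GET /evaluations/{id} retrieval endpoint (not documented in this section) and the same assumed base URL and auth header as in the request example above:

```python
import os
import time

import requests

HEADERS = {"X-API-KEY": os.environ["HUMANLOOP_API_KEY"]}  # assumed auth header
TERMINAL = {"completed", "failed", "cancelled"}  # terminal values of the status enum


def wait_for_evaluation(evaluation_id: str, poll_seconds: float = 5.0) -> dict:
    """Poll the evaluation until its status leaves pending/running.

    Assumes a companion GET endpoint at /evaluations/{id}; check your
    API reference for the actual retrieval route.
    """
    url = f"https://api.humanloop.com/v4/evaluations/{evaluation_id}"
    while True:
        response = requests.get(url, headers=HEADERS)
        response.raise_for_status()
        evaluation = response.json()
        if evaluation["status"] in TERMINAL:
            return evaluation
        time.sleep(poll_seconds)


evaluation = wait_for_evaluation("ev_abc123")  # placeholder ID from the create call
if evaluation["status"] == "completed":
    # evaluator_aggregates summarizes evaluator results across the dataset
    for aggregate in evaluation.get("evaluator_aggregates") or []:
        print(aggregate)
```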

Errors