Upload a Dataset from CSV

In this guide, we will walk through creating a Dataset on Humanloop from a CSV.

Datasets are a collection of input-output pairs that can be used to evaluate your Prompts, Tools or even Evaluators.

Prerequisites

You should have an existing Prompt on Humanloop with a variable defined with our double curly bracket syntax {{variable}}. If not, first follow our guide on creating a Prompt.

In this example, we’ll use a Prompt that categorises user queries about Humanloop’s product and docs by which feature they relate to.

An example Prompt with a variable `{{query}}`.

Steps

To create a dataset from a CSV file, we’ll first create a CSV in Google Sheets that contains values for our Prompt variable {{query}} and then upload it to a Dataset on Humanloop.

1

Create a CSV file.

  • In our Google Sheets example below, we have a column called query which contains possible values for our Prompt variable {{query}}. You can include as many columns as you have variables in your Prompt template.
  • There is additionally a column called target which will populate the target output for the classifier Prompt. In this case, we use simple strings to define the target.
  • More complex Datapoints that contain messages and structured objects for targets are supported, but are harder to incorporate into a CSV file as they tend to be hard-to-read JSON. If you need more complex Datapoints, use the API instead.
A CSV file in Google Sheets defining query and target pairs for our Classifier Prompt.
2

Export the Google Sheet to CSV

In Google Sheets, choose FileDownloadComma-separated values (.csv)

3

Create a new Dataset File

On Humanloop, select New at the bottom of the left-hand sidebar, then select Dataset.

Create a new File from the sidebar on Humanloop.
4

Click Upload CSV

First name your dataset when prompted in the sidebar, then select the Upload CSV button and drag and drop the CSV file you created above using the file explorer. You will then be prompted to provide a commit message to describe the initial state of the dataset.

Uploading a CSV file to create a dataset.
5

Map the CSV columns

Map each of the CSV columns into one of input, message, target. To avoid uploading a column of your CSV you can map it to the exclude option.

To map in columns to Messages, they need to be in a specific format. An example of this can be seen in our example Dataset or below:

"[{""role"": ""user"", ""content"": ""Tell me about the weather""}]"

Once you have mapped your columns, press Extend Current Dataset

Mapping columns of a CSV into specific values of a dataset.
6

Review your uploaded datapoints

You’ll see the input-output pairs that were included in the CSV file and you can review the rows to inspect and edit the individual Datapoints.

Inspect the Dataset created from the CSV file.

Commit the dataset

Click the commit button at the top of the Dataset editor and fill in a commit message. Press Commit again.

Commit a Dataset.

Your dataset is now uploaded and ready for use.

Next steps

🎉 Now that you have Datasets defined in Humanloop, you can leverage our Evaluations feature to systematically measure and improve the performance of your AI applications. See our guides on setting up Evaluators and Running an Evaluation to get started.

For different ways to create datasets, see the links below:

  • Create a Dataset from existing Logs - useful for curating Datasets based on how your AI application has been behaving in the wild.
  • Upload via API - useful for uploading more complex Datasets that may have nested JSON structures, which are difficult to represent in tabular .CSV format, and for integrating with your existing data pipelines.