Get eval run output items

GET https://api.openai.com/v1/evals/{eval_id}/runs/{run_id}/output_items

Get a list of output items for an evaluation run.

Path parameters

  • eval_id
    string
    Required
    The ID of the evaluation that the run belongs to.
  • run_id
    string
    Required
    The ID of the run to retrieve output items for.

Query parameters

  • after
    string
    Identifier for the last output item from the previous pagination request.
  • limit
    integer
    Defaults: 20
    Number of output items to retrieve.
  • status
    string
    Filter output items by status. Set status=failed to return only failed output items, or status=pass to return only passed output items.
  • order
    string
    Defaults: asc
    Sort order for output items by timestamp. Use asc for ascending order or desc for descending order.
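
The path and query parameters above can be combined into a request URL. The following is a minimal sketch rather than an official client: the helper name output_items_url and the constant BASE_URL are hypothetical, and the check on order only reflects the two values named above.

```python
from urllib.parse import urlencode

# Assumption: base URL taken from the endpoint shown at the top of this page.
BASE_URL = "https://api.openai.com/v1"


def output_items_url(eval_id, run_id, after=None, limit=None, status=None, order=None):
    """Build the GET URL for a run's output items (hypothetical helper).

    Query parameters left as None are omitted, so the server-side
    defaults (limit=20, order=asc) apply.
    """
    if order is not None and order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    params = {"after": after, "limit": limit, "status": status, "order": order}
    # Drop unset parameters, then percent-encode the rest.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    path = f"{BASE_URL}/evals/{eval_id}/runs/{run_id}/output_items"
    return f"{path}?{query}" if query else path


# Placeholder IDs for illustration only.
print(output_items_url("eval_123", "evalrun_456", limit=50, status="failed", order="desc"))
```

The resulting URL can be passed to any HTTP client along with the Authorization header.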

Response

A list of EvalRunOutputItem objects for the specified run.

Example request
curl https://api.openai.com/v1/evals/egroup_67abd54d9b0081909a86353f6fb9317a/runs/erun_67abd54d60ec8190832b46859da808f7/output_items \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json"
Example response
{
  "object": "list",
  "data": [
    {
      "object": "eval.run.output_item",
      "id": "outputitem_67e5796c28e081909917bf79f6e6214d",
      "created_at": 1743092076,
      "run_id": "evalrun_67abd54d60ec8190832b46859da808f7",
      "eval_id": "eval_67abd54d9b0081909a86353f6fb9317a",
      "status": "pass",
      "datasource_item_id": 5,
      "datasource_item": {
        "input": "Stock Markets Rally After Positive Economic Data Released",
        "ground_truth": "Markets"
      },
      "results": [
        {
          "name": "String check-a2486074-d803-4445-b431-ad2262e85d47",
          "sample": null,
          "passed": true,
          "score": 1.0
        }
      ],
      "sample": {
        "input": [
          {
            "role": "developer",
            "content": "Categorize a given news headline into one of the following topics: Technology, Markets, World, Business, or Sports.\n\n# Steps\n\n1. Analyze the content of the news headline to understand its primary focus.\n2. Extract the subject matter, identifying any key indicators or keywords.\n3. Use the identified indicators to determine the most suitable category out of the five options: Technology, Markets, World, Business, or Sports.\n4. Ensure only one category is selected per headline.\n\n# Output Format\n\nRespond with the chosen category as a single word. For instance: \"Technology\", \"Markets\", \"World\", \"Business\", or \"Sports\".\n\n# Examples\n\n**Input**: \"Apple Unveils New iPhone Model, Featuring Advanced AI Features\" \n**Output**: \"Technology\"\n\n**Input**: \"Global Stocks Mixed as Investors Await Central Bank Decisions\" \n**Output**: \"Markets\"\n\n**Input**: \"War in Ukraine: Latest Updates on Negotiation Status\" \n**Output**: \"World\"\n\n**Input**: \"Microsoft in Talks to Acquire Gaming Company for $2 Billion\" \n**Output**: \"Business\"\n\n**Input**: \"Manchester United Secures Win in Premier League Football Match\" \n**Output**: \"Sports\" \n\n# Notes\n\n- If the headline appears to fit into more than one category, choose the most dominant theme.\n- Keywords or phrases such as \"stocks\", \"company acquisition\", \"match\", or technological brands can be good indicators for classification.\n",
            "tool_call_id": null,
            "tool_calls": null,
            "function_call": null
          },
          {
            "role": "user",
            "content": "Stock Markets Rally After Positive Economic Data Released",
            "tool_call_id": null,
            "tool_calls": null,
            "function_call": null
          }
        ],
        "output": [
          {
            "role": "assistant",
            "content": "Markets",
            "tool_call_id": null,
            "tool_calls": null,
            "function_call": null
          }
        ],
        "finish_reason": "stop",
        "model": "gpt-4o-mini-2024-07-18",
        "usage": {
          "total_tokens": 325,
          "completion_tokens": 2,
          "prompt_tokens": 323,
          "cached_tokens": 0
        },
        "error": null,
        "temperature": 1.0,
        "max_completion_tokens": 2048,
        "top_p": 1.0,
        "seed": 42
      }
    }
  ],
  "first_id": "outputitem_67e5796c28e081909917bf79f6e6214d",
  "last_id": "outputitem_67e5796c28e081909917bf79f6e6214d",
  "has_more": true
}
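
The has_more and last_id fields in the response, together with the after query parameter, suggest cursor-style pagination. The sketch below assumes that passing the previous page's last_id as after returns the next page. The names iter_output_items and fetch_page are hypothetical, and fake_fetch is a stub with made-up IDs so the example runs without network access.

```python
from typing import Callable, Iterator, Optional


def iter_output_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield output items page by page until has_more is false.

    fetch_page(after) is a caller-supplied function (hypothetical) that
    performs the GET request with the given `after` cursor and returns
    the decoded JSON body as a dict.
    """
    after = None
    while True:
        page = fetch_page(after)
        yield from page.get("data", [])
        # Stop when the server reports no further pages, or when there is
        # no cursor to continue from.
        if not page.get("has_more") or not page.get("last_id"):
            break
        after = page["last_id"]


# Stub pages standing in for real API responses (illustrative data only).
_PAGES = {
    None: {
        "object": "list",
        "data": [{"id": "outputitem_a", "status": "pass"}],
        "first_id": "outputitem_a",
        "last_id": "outputitem_a",
        "has_more": True,
    },
    "outputitem_a": {
        "object": "list",
        "data": [{"id": "outputitem_b", "status": "pass"}],
        "first_id": "outputitem_b",
        "last_id": "outputitem_b",
        "has_more": False,
    },
}


def fake_fetch(after: Optional[str]) -> dict:
    """Return the stub page that follows the given cursor."""
    return _PAGES[after]


for item in iter_output_items(fake_fetch):
    print(item["id"], item["status"])
```

In real use, fetch_page would issue the GET request with the Authorization header and the after parameter set to the supplied cursor; keeping the HTTP call outside the loop makes the pagination logic easy to test.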