All 108 runs succeeded across six models, with the fastest finishing in under 5 seconds and the best reaching 92% field match.
This benchmark evaluated extraction quality on a single dataset (`downloaded`), with 18 runs per model at 2 runs per chat.
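As a quick sanity check on the totals quoted above, the run count can be rebuilt from the run configuration recorded with this report (six models, `runs_per_chat` of 2, one few-shot variant). This is only a sketch: the number of chats with a reference is not stored in the config, so the value 9 is an assumption inferred from 18 runs per model, and the product formula is a guess at how the harness schedules runs, not its actual code.

```python
# Values copied from the recorded run configuration.
config = {
    "models": [
        "sonnet-4-5", "sonnet-4-6", "opus-4-6",
        "openai:4.1", "openai:5.4", "gemini:gemini-2.5-pro",
    ],
    "runs_per_chat": 2,
    "few_shot_variants": [{"label": "explicit", "count": 4}],
}

# Assumption: 9 chats had a golden reference (18 runs per model / 2 runs per chat).
CHATS_WITH_REFERENCE = 9


def planned_runs(cfg: dict, n_chats: int) -> int:
    """Hypothetical helper: models x chats x repeats x few-shot variants."""
    return (
        len(cfg["models"])
        * n_chats
        * cfg["runs_per_chat"]
        * len(cfg["few_shot_variants"])
    )


total_runs = planned_runs(config, CHATS_WITH_REFERENCE)
runs_per_model = total_runs // len(config["models"])
print(total_runs, runs_per_model)
```

Under these assumptions the product matches the 108 runs and 18 runs per model reported here.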
The chart below maps each model's accuracy against its average runtime, revealing clear tradeoffs between speed and precision.
The table below lists all six models tested, each prompted with the same four explicit few-shot examples (18 runs per model).
| Model | FS | Runs | Avg elapsed (s) | Avg mismatches | Field match |
|---|---|---|---|---|---|
All models were tested with the same explicit few-shot set of four example chats (`few_shot_mode` is `explicit`, with a single variant).
Field match rates stayed within 2–4 percentage points for most models, which suggests the prompt behaves consistently.
Based on these results, here are the highest-priority actions for stakeholders:
{
  "agent": "so_extraction",
  "pipeline": null,
  "models": [
    "sonnet-4-5",
    "sonnet-4-6",
    "opus-4-6",
    "openai:4.1",
    "openai:5.4",
    "gemini:gemini-2.5-pro"
  ],
  "datasets": [
    "downloaded"
  ],
  "chat": null,
  "chats_glob": null,
  "bulk": false,
  "runs_per_chat": 2,
  "max_workers": 25,
  "few_shot_explicit": [
    "/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/multiple_product_multiple_shipment_medium.json",
    "/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/single_product_multiple_shipment_complex.json",
    "/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/single_product_single_shipment_medium.json",
    "/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/updates/update_change_payment_terms.json"
  ],
  "few_shot_walk": [],
  "few_shot_sweep": [],
  "few_shot_pool_argv": [],
  "few_shot_seed": 42,
  "db_few_shot_limit": 0,
  "skip_without_expected": true,
  "results_dir": "/Users/tripathipranav/Documents/code/harness_agents/results/20260518T193425Z",
  "config_file": "configs/agents.json",
  "few_shot_mode": "explicit",
  "few_shot_pool_size": 68,
  "few_shot_default_pool_size": 68,
  "few_shot_pool_override": null,
  "few_shot_variants": [
    {
      "label": "explicit",
      "count": 4,
      "paths": [
        "raw_data/chats/multiple_product_multiple_shipment_medium.json",
        "raw_data/chats/single_product_multiple_shipment_complex.json",
        "raw_data/chats/single_product_single_shipment_medium.json",
        "raw_data/chats/updates/update_change_payment_terms.json"
      ]
    }
  ],
  "allow_self_fewshot": false
}

SUCCESS RATE: Share of runs that finished without a harness or HTTP error. High success means the run was stable; it does not prove the answers matched the reference.
AVG ELAPSED (S): Average wall time per run in that bucket. Useful for latency comparisons.
AVG MISMATCH / EXPECTED RUN: Average count of fields that differed from the golden JSON when a reference existed. Lower is better.
FIELD MATCH: Fraction of compared fields that matched the golden output across runs in that bucket. Higher is better.
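
The four metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the harness's actual implementation. The record keys (`error`, `elapsed_s`, `expected`, `actual`) and the helper names are hypothetical, and the sketch assumes that only fields present in the golden JSON are compared and that values are compared with strict equality. The paths it emits follow the `data[0].items[0].unit_price` style used in the mismatch samples.

```python
from typing import Any, Iterator


def iter_leaves(value: Any, path: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (path, leaf) pairs, e.g. ('data[0].items[0].total', 5850.0)."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_leaves(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from iter_leaves(child, f"{path}[{index}]")
    else:
        yield path, value


def diff_fields(expected: Any, actual: Any) -> tuple[list[dict], int]:
    """Return (mismatch records, number of compared fields)."""
    expected_leaves = dict(iter_leaves(expected))
    actual_leaves = dict(iter_leaves(actual))
    # Assumption: a field missing from the output is treated as None.
    mismatches = [
        {"path": p, "expected": v, "actual": actual_leaves.get(p)}
        for p, v in expected_leaves.items()
        if actual_leaves.get(p) != v
    ]
    return mismatches, len(expected_leaves)


def summarize(runs: list[dict]) -> dict:
    """Aggregate one bucket of runs into the four metrics."""
    ok = [r for r in runs if r["error"] is None]
    scored = [r for r in ok if r.get("expected") is not None]
    mismatch_total = compared_total = 0
    for r in scored:
        mismatches, compared = diff_fields(r["expected"], r["actual"])
        mismatch_total += len(mismatches)
        compared_total += compared
    n = len(runs)
    return {
        "success_rate": len(ok) / n if n else 0.0,
        "avg_elapsed_s": sum(r["elapsed_s"] for r in runs) / n if n else 0.0,
        "avg_mismatch": mismatch_total / len(scored) if scored else 0.0,
        "field_match": 1 - mismatch_total / compared_total if compared_total else 0.0,
    }


# Toy records shaped like the mismatch samples in this report.
golden = {"data": [{"po_ref_no": "4520000944",
                    "items": [{"unit_price": 3.05, "pricing_unit": "USD/KG"}]}]}
output = {"data": [{"po_ref_no": "",
                    "items": [{"unit_price": None, "pricing_unit": "USD/KG"}]}]}
runs = [
    {"error": None, "elapsed_s": 4.0, "expected": golden, "actual": output},
    {"error": None, "elapsed_s": 6.0, "expected": golden, "actual": golden},
]
summary = summarize(runs)
first_diff, compared_fields = diff_fields(golden, output)
```

With strict equality, formatting-only differences such as `23MT/40'FCL` versus `23 MT / 40' FCL` count as mismatches, as does an equivalent price expressed in a different unit (3250.0 USD/MT versus 3.25 USD/KG). Whether to normalize such values before comparing is a design choice this sketch leaves open.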
| Agent | Chat | Model | FS | Mismatches | Sample |
|---|---|---|---|---|---|
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.4 | 4 | 12 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3.05,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/KG",
"actual": ""
},
{
"path": "data[0].items[0].ship_term",
"expected": "EXW",
"actual": ""
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "EXW",
"actual": ""
},
{
"path": "data[0].items[0].shipment_date",
"expected": "2026-03-31",
"actual": ""
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.4 | 4 | 7 | [
{
"path": "data[0].items[0].delivery_terms",
"expected": "EXW",
"actual": ""
},
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].items[0].total",
"expected": 5850.0,
"actual": null
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].po_ref_no",
"expected": "",
"actual": "OPO 260012/EC"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.4 | 4 | 7 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2026-03-31",
"actual": ""
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Japan",
"actual": ""
},
{
"path": "data[0].items[0].packing",
"expected": "25kg bags in carton",
"actual": ""
},
{
"path": "data[0].do_date",
"expected": "2026-03-31",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "4520000944",
"actual": ""
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 4 | 7 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": ""
},
{
"path": "data[0].do_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].po_date",
"expected": "2025-09-29",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "BP102-2025-1",
"actual": ""
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 4 | 7 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": ""
},
{
"path": "data[0].do_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].po_date",
"expected": "2025-09-29",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "BP102-2025-1",
"actual": ""
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 4 | 7 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": ""
},
{
"path": "data[0].do_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].po_date",
"expected": "2025-09-29",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "BP102-2025-1",
"actual": ""
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:5.4 | 4 | 6 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Jakarta",
"actual": ""
},
{
"path": "data[0].items[0].total",
"expected": 29250.0,
"actual": null
},
{
"path": "data[0].po_ref_no",
"expected": "PO-IMP-BIB-2601-017",
"actual": ""
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:5.4 | 4 | 6 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Jakarta",
"actual": ""
},
{
"path": "data[0].items[0].total",
"expected": 29250.0,
"actual": null
},
{
"path": "data[0].po_ref_no",
"expected": "PO-IMP-BIB-2601-017",
"actual": ""
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-pro | 4 | 6 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": "Busan"
},
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
},
{
"path": "data[0].delivery_terms",
"expected": "CIF Busan",
"actual": "CIF"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.4 | 4 | 6 | [
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD"
},
{
"path": "data[0].items[0].packing",
"expected": "25kg printed paper bag",
"actual": "25kg printed paper bags"
},
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.4 | 4 | 5 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].items[0].total",
"expected": 5850.0,
"actual": null
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].po_ref_no",
"expected": "",
"actual": "OPO 260012/EC"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-pro | 4 | 5 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].po_ref_no",
"expected": "",
"actual": "OPO 260012/EC"
},
{
"path": "data[0].billing_address",
"expected": "",
"actual": "Leonardo da Vinci"
},
{
"path": "data[0].shipping_method",
"expected": "Collection Against OPO 260012/EC",
"actual": "Collection"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 4 | 5 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].do_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].po_date",
"expected": "2025-09-29",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "BP102-2025-1",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": "Leonardo da Vinci, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 4 | 5 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].do_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].po_date",
"expected": "2025-09-29",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "BP102-2025-1",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": "Leonardo da Vinci, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 4 | 5 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].do_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].po_date",
"expected": "2025-09-29",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "BP102-2025-1",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": "Leonardo da Vinci"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 4 | 5 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].do_date",
"expected": "2025-11-15",
"actual": "2026-11-15"
},
{
"path": "data[0].po_date",
"expected": "2025-09-29",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "BP102-2025-1",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": "Leonardo da Vinci, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-6 | 4 | 4 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3.25,
"actual": 3250.0
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/KG",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-01"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-01"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-6 | 4 | 4 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3.25,
"actual": 3250.0
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/KG",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-01"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-01"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-6 | 4 | 4 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2026-03-31",
"actual": "2026-03-15"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Japan",
"actual": ""
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
},
{
"path": "data[0].do_date",
"expected": "2026-03-31",
"actual": "2026-03-15"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-6 | 4 | 4 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2026-03-31",
"actual": "2026-03-15"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Japan",
"actual": ""
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
},
{
"path": "data[0].do_date",
"expected": "2026-03-31",
"actual": "2026-03-15"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:4.1 | 4 | 4 | [
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD"
},
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.4 | 4 | 4 | [
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD"
},
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.4 | 4 | 4 | [
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD"
},
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-pro | 4 | 4 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
},
{
"path": "data[0].delivery_terms",
"expected": "CIF Busan",
"actual": "CIF"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:4.1 | 4 | 4 | [
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD"
},
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 4 | 4 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": ""
},
{
"path": "data[0].po_ref_no",
"expected": "BP102-2025-1",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": ""
},
{
"path": "data[0].shipping_address",
"expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
"actual": ""
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.4 | 4 | 4 | [
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD"
},
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-pro | 4 | 4 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
},
{
"path": "data[0].shipping_method",
"expected": "",
"actual": "Unknown"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-5 | 4 | 3 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].billing_address",
"expected": "",
"actual": "Leonardo da Vinci"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-5 | 4 | 3 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].billing_address",
"expected": "",
"actual": "Leonardo da Vinci"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:4.1 | 4 | 3 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].shipping_method",
"expected": "Collection Against OPO 260012/EC",
"actual": ""
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:4.1 | 4 | 3 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.4 | 4 | 3 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": ""
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.4 | 4 | 3 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": ""
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-pro | 4 | 3 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
},
{
"path": "data[0].shipping_address",
"expected": "",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:4.1 | 4 | 3 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-5 | 4 | 3 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Japan",
"actual": ""
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
},
{
"path": "data[0].billing_address",
"expected": "",
"actual": "Leonardo da Vinci"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-6 | 4 | 3 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3.05,
"actual": 3050.0
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/KG",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-5 | 4 | 3 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Japan",
"actual": ""
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
},
{
"path": "data[0].billing_address",
"expected": "",
"actual": "Leonardo da Vinci"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | sonnet-4-5 | 4 | 3 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Jakarta",
"actual": ""
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | sonnet-4-5 | 4 | 3 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Jakarta",
"actual": ""
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | opus-4-6 | 4 | 3 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Jakarta",
"actual": ""
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | opus-4-6 | 4 | 3 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Jakarta",
"actual": ""
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:4.1 | 4 | 3 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Jakarta",
"actual": ""
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:4.1 | 4 | 3 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
},
{
"path": "data[0].items[0].shipping_address",
"expected": "Jakarta",
"actual": ""
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 4 | 3 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].items[0].loading",
"expected": "",
"actual": "13MT/20'FCL"
},
{
"path": "data[0].billing_address",
"expected": "",
"actual": "Leonardo da Vinci, "
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:4.1 | 4 | 3 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 4 | 3 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40' FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 4 | 3 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40' FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:4.1 | 4 | 3 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 4 | 3 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40' FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 4 | 3 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40' FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].payment_date",
"expected": "",
"actual": "Net 14 Days"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-pro | 4 | 3 | [
{
"path": "data[0].items[0].loading",
"expected": "23MT/40'FCL",
"actual": "23 MT / 40’ FCL"
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": "AG Lipids Pte Ltd"
},
{
"path": "data[0].delivery_terms",
"expected": "CIF Busan",
"actual": "CIF"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-6 | 4 | 2 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-6 | 4 | 2 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-pro | 4 | 2 | [
{
"path": "data[0].items[0].shipment_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
},
{
"path": "data[0].do_date",
"expected": "2027-03-01",
"actual": "2026-03-31"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "Leonardo da Vinci"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "Leonardo da Vinci"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:4.1 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-pro | 4 | 2 | [
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
},
{
"path": "data[0].shipping_address",
"expected": "",
"actual": "Japan"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:4.1 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Japan",
"actual": ""
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:4.1 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Japan",
"actual": ""
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | gemini:gemini-2.5-pro | 4 | 2 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | gemini:gemini-2.5-pro | 4 | 2 | [
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/MT",
"actual": "USD/KG"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].items[0].loading",
"expected": "",
"actual": "13MT/20'FCL"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:4.1 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].delivery_terms",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:4.1 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].delivery_terms",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.4 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.4 | 4 | 2 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.4 | 4 | 2 | [
{
"path": "data[0].items[0].total",
"expected": 6300.0,
"actual": null
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.4 | 4 | 2 | [
{
"path": "data[0].items[0].total",
"expected": 6300.0,
"actual": null
},
{
"path": "data[0].vendor_name",
"expected": "Van Beethoven",
"actual": ""
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-pro | 4 | 1 | [
{
"path": "data[0].billing_address",
"expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
"actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-pro | 4 | 1 | [
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 32025.0
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 4 | 1 | [
{
"path": "data[0].items[0].loading",
"expected": "",
"actual": "13MT/20'FCL"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 4 | 1 | [
{
"path": "data[0].items[0].loading",
"expected": "",
"actual": "13MT/20'FCL"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 4 | 1 | [
{
"path": "data[0].items[0].shipping_address",
"expected": "Nhava Sheva",
"actual": ""
}
] |