All 36 runs completed successfully in about six seconds each, yet only 39% of fields matched expected values.
This benchmark tested structured data extraction against a single dataset ('downloaded') containing purchase order information.
All four models delivered similar speed (6.0 to 6.3 seconds) but differed in accuracy. Opus models extracted fields more reliably than Sonnet variants.
The table below compares model configurations. Each row covers the 9 runs for one model (one run per chat), all using the same three explicit few-shot examples.
| Model | FS | Runs | Avg s | Mismatch | Field match |
|---|---|---|---|---|---|
Each model was given the same three explicit few-shot examples (one simple, one medium, and one complex chat), so this run does not isolate the effect of example count.
The accuracy differences observed here are between the Opus and Sonnet model families, not between few-shot settings.
Based on these results, prioritize the following steps to improve extraction quality while maintaining speed.
{
"agent": "so_extraction",
"pipeline": null,
"models": [
"sonnet-4-6",
"opus-4-5",
"opus-4-6",
"sonnet-4-5"
],
"datasets": [
"downloaded"
],
"chat": null,
"chats_glob": null,
"bulk": false,
"runs_per_chat": 1,
"max_workers": 25,
"few_shot_explicit": [
"/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/multiple_product_multiple_shipment_complex.json",
"/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/single_product_multiple_shipment_medium.json",
"/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/single_product_single_shipment_simple.json"
],
"few_shot_walk": [],
"few_shot_sweep": [],
"few_shot_pool_argv": [],
"few_shot_seed": 42,
"db_few_shot_limit": 0,
"skip_without_expected": false,
"results_dir": "/Users/tripathipranav/Documents/code/harness_agents/results/20260515T033428Z",
"config_file": "configs/agents.json",
"few_shot_mode": "explicit",
"few_shot_pool_size": 68,
"few_shot_default_pool_size": 68,
"few_shot_pool_override": null,
"few_shot_variants": [
{
"label": "explicit",
"count": 3,
"paths": [
"raw_data/chats/multiple_product_multiple_shipment_complex.json",
"raw_data/chats/single_product_multiple_shipment_medium.json",
"raw_data/chats/single_product_single_shipment_simple.json"
]
}
],
"allow_self_fewshot": false
}
SUCCESS RATE: Share of runs that finished without a harness or HTTP error. High success means the run was stable; it does not prove the answers matched the reference.
AVG ELAPSED (S): Average wall time per run in that bucket. Useful for latency comparisons.
AVG MISMATCH / EXPECTED RUN: Average count of fields that differed from the golden JSON when a reference existed. Lower is better.
FIELD MATCH: Fraction of compared fields that matched the golden output across runs in that bucket. Higher is better.
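
The four definitions above can be sketched as one aggregation over a bucket of runs. This is a minimal illustration, not the harness's actual code: the `RunRecord` class, its field names, and the choice to pool field counts across runs (rather than average per-run ratios) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RunRecord:
    """Hypothetical per-run record; names are illustrative, not the harness schema."""
    ok: bool                        # finished without a harness or HTTP error
    elapsed_s: float                # wall time for the run, in seconds
    mismatches: Optional[int]       # differing fields, or None if no golden JSON existed
    compared_fields: Optional[int]  # number of fields compared against the golden JSON


def summarize(runs: Sequence[RunRecord]) -> dict:
    """Aggregate one bucket of runs into the four report metrics."""
    if not runs:
        raise ValueError("bucket is empty")
    # Only runs that had a reference contribute to the accuracy metrics.
    scored = [r for r in runs if r.mismatches is not None and r.compared_fields]
    total_mismatch = sum(r.mismatches for r in scored)
    total_compared = sum(r.compared_fields for r in scored)
    return {
        "success_rate": sum(r.ok for r in runs) / len(runs),
        "avg_elapsed_s": sum(r.elapsed_s for r in runs) / len(runs),
        "avg_mismatch": total_mismatch / len(scored) if scored else None,
        # Assumption: field match is pooled over all compared fields in the bucket.
        "field_match": 1 - total_mismatch / total_compared if total_compared else None,
    }
```

With two made-up runs of 25 compared fields each, one with 15 mismatches and one with 10, this gives an average mismatch of 12.5 and a field match of 0.5. These numbers are invented for illustration and are not taken from the report.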
| Agent | Chat | Model | FS | Mismatches | Sample |
|---|---|---|---|---|---|
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 3 | 19 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-5 | 3 | 17 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 3
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-5 | 3 | 17 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 3
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-6 | 3 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 10500.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3100.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": ""
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 3 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-6 | 3 | 16 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 2
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 6.3
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-6 | 3 | 16 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 2
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 6.3
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | sonnet-4-6 | 3 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-6 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-5 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-5 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | sonnet-4-5 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | opus-4-6 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | opus-4-5 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-6 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 3 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-6 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "EXW"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-5 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-5 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-5 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-6 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-5 | 3 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 3 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 3 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 3 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 3 | 12 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-5 | 3 | 12 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-5 | 3 | 10 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 3 | 10 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-5 | 3 | 1 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 0
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-6 | 3 | 1 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 0
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-5 | 3 | 1 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 0
}
] |
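
Many of the samples above differ from the golden JSON only by unit convention: the reference stores quantities in kg and prices in usd/mt, while the models often return MT and USD/KG. The sketch below shows one way such values could be brought to a common unit before comparison. It is an illustration under stated assumptions: the function names are hypothetical, only the unit strings that appear in the samples are handled, and the report does not say whether this belongs in the prompt, in post-processing, or in the comparator.

```python
KG_PER_MT = 1000.0  # one metric ton is 1000 kg


def quantity_to_kg(quantity: float, unit: str) -> float:
    """Convert a quantity to kg; handles only the units seen in the samples."""
    unit = unit.strip().lower()
    if unit == "kg":
        return quantity
    if unit == "mt":
        return quantity * KG_PER_MT
    raise ValueError(f"unsupported quantity unit: {unit!r}")


def price_to_usd_per_mt(unit_price: float, pricing_unit: str) -> float:
    """Convert a unit price to USD per metric ton; unit match is case-insensitive."""
    unit = pricing_unit.strip().lower()
    if unit == "usd/mt":
        return unit_price
    if unit == "usd/kg":
        return unit_price * KG_PER_MT
    raise ValueError(f"unsupported pricing unit: {unit!r}")
```

Under this normalization, a model answer of 1.8 MT at 3.25 USD/KG agrees with a reference of 1800 kg at 3250 usd/mt. Not every mismatch is a unit issue: 23 MT against an expected 46000 kg, or 10.5 MT against an expected 6300 kg, still disagree after conversion and reflect a real extraction difference.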