This benchmark evaluated the `so_extraction` agent with 10 models in a zero-shot configuration, for 510 runs in total. Every run completed without error, but the quality of the extracted data varied widely. The overall field match rate was 81%, with an average of 4.5 mismatched fields per run, which leaves room for improvement. The high standard deviation of the mismatch count (5.13) reflects inconsistent quality, and a large part of that spread appears to come from the dataset rather than the model: the `downloaded` chats matched only 37.8% of fields, against roughly 90% for `acme_foods` and `nova_exports`. By model family, `opus` and `sonnet` generally extracted more accurately than `gemini` and `openai`, with one exception: on field match rate, `openai:4.1` (83.1%) slightly exceeded `sonnet-4-5` (82.5%).
Success rate: The percentage of runs where the agent and model completed without an error, shown as a fraction in the `Success` column. In this brief, all combinations achieved a 100% success rate.

Avg runtime: The average time in seconds the agent took to complete a run, shown as `Avg elapsed (s)` in the tables.

Avg mismatch/expected run: For runs with a 'golden' expected output, the average number of fields whose extracted values did not match. Lower is better.

Field match rate: The percentage of total fields that correctly matched the expected output across all runs for that combination. Higher is better.

Mismatch stdev: The standard deviation of the mismatch count across all individual runs. A high value suggests inconsistent quality, with some runs having few errors and others having many.
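The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the harness's actual implementation: the record keys (`ok`, `elapsed_s`, `expected_fields`, `mismatched`) and the sample values are hypothetical, and the use of a population standard deviation is an assumption, since the report does not say which variant it uses.

```python
from statistics import mean, pstdev

# Hypothetical per-run records; these keys are assumptions, not the harness schema.
runs = [
    {"ok": True, "elapsed_s": 4.2, "expected_fields": 20, "mismatched": 0},
    {"ok": True, "elapsed_s": 5.0, "expected_fields": 25, "mismatched": 5},
    {"ok": True, "elapsed_s": 6.4, "expected_fields": 15, "mismatched": 1},
]


def summarize(runs):
    """Aggregate per-run records into the metrics defined above."""
    # Only runs with a golden expected output contribute to the quality metrics.
    scored = [r for r in runs if r["expected_fields"] > 0]
    total_fields = sum(r["expected_fields"] for r in scored)
    total_mismatch = sum(r["mismatched"] for r in scored)
    return {
        "success_rate": mean(1.0 if r["ok"] else 0.0 for r in runs),
        "avg_elapsed_s": mean(r["elapsed_s"] for r in runs),
        "avg_mismatch": mean(r["mismatched"] for r in scored),
        # Pooled over fields, so runs with more fields carry more weight.
        "field_match_rate": 1 - total_mismatch / total_fields,
        # Assumption: population standard deviation.
        "mismatch_stdev": pstdev(r["mismatched"] for r in scored),
    }


summary = summarize(runs)
print(summary)
```

Because the field match rate is pooled over fields while the average mismatch is averaged over runs, the two metrics can rank models differently when runs have different numbers of expected fields.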
All runs were zero-shot, and the `so_extraction` agent's performance varied considerably by model. The `opus` models and `sonnet-4-6` led on accuracy: `sonnet-4-6` achieved the highest field match rate (87.6%), closely followed by `opus-4-6` (87.4%), which had the lowest average mismatch count (2.92). `sonnet-4-5` (82.5%) trailed `openai:4.1` (83.1%) on field match rate, although it averaged fewer mismatches per run (3.59 versus 4.16). The Gemini models were the weakest performers; `gemini:gemini-2.5-flash` had the lowest field match rate (75.3%) and the highest average mismatch count (6.08). In terms of speed, `openai:4.1` was the fastest at 1.8 seconds per run, ahead of `openai:5.4` (2.4 seconds) and `openai:5.2` (3.2 seconds), while `openai:5-mini` was the slowest at nearly 21 seconds.
{
"agent": "so_extraction",
"pipeline": null,
"models": [
"sonnet-4-6",
"sonnet-4-5",
"opus-4-5",
"opus-4-6",
"openai:4.1",
"openai:5.2",
"openai:5-mini",
"openai:5.4",
"gemini:gemini-2.5-pro",
"gemini:gemini-2.5-flash"
],
"datasets": [
"downloaded",
"acme_foods",
"nova_exports"
],
"chat": null,
"chats_glob": null,
"bulk": false,
"runs_per_chat": 1,
"max_workers": 25,
"few_shot_explicit": [],
"few_shot_sweep": [],
"few_shot_pool_argv": [
"/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/multiple_product_multiple_shipment_medium.json",
"/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/single_product_multiple_shipment_medium.json",
"/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/single_product_single_shipment_medium.json",
"/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/updates/update_change_quantity.json",
"/Users/tripathipranav/Documents/code/harness_agents/raw_data/downloaded_chats/03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json"
],
"few_shot_seed": 42,
"db_few_shot_limit": 0,
"skip_without_expected": true,
"results_dir": "/Users/tripathipranav/Documents/code/harness_agents/results/20260512T184755Z",
"config_file": "configs/agents.json",
"few_shot_pool_size": 5,
"few_shot_default_pool_size": 68,
"few_shot_pool_override": [
"raw_data/chats/multiple_product_multiple_shipment_medium.json",
"raw_data/chats/single_product_multiple_shipment_medium.json",
"raw_data/chats/single_product_single_shipment_medium.json",
"raw_data/chats/updates/update_change_quantity.json",
"raw_data/downloaded_chats/03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json"
],
"few_shot_variants": [
{
"label": "none",
"count": 0,
"paths": []
}
],
"allow_self_fewshot": false
}
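As a sanity check on the configuration above, the following sketch estimates how many runs it should produce and confirms that the only few-shot variant is the zero-shot one. It assumes the harness runs a full cross product of models, few-shot variants, and chats, which the report does not state explicitly. The chat counts per dataset are not in the config; they are inferred from the per-dataset run totals in the results tables divided by the number of models.

```python
# Values copied from the run configuration; only the keys needed here are kept.
config = {
    "models": [
        "sonnet-4-6", "sonnet-4-5", "opus-4-5", "opus-4-6",
        "openai:4.1", "openai:5.2", "openai:5-mini", "openai:5.4",
        "gemini:gemini-2.5-pro", "gemini:gemini-2.5-flash",
    ],
    "runs_per_chat": 1,
    "few_shot_variants": [{"label": "none", "count": 0, "paths": []}],
}

# Inferred, not configured: per-dataset runs (210, 90, 210) divided by 10 models.
chats_per_dataset = {"acme_foods": 21, "downloaded": 9, "nova_exports": 21}


def expected_runs(config, chats_per_dataset):
    """Run count under the assumed models x variants x chats cross product."""
    n_chats = sum(chats_per_dataset.values())
    return (
        len(config["models"])
        * len(config["few_shot_variants"])
        * n_chats
        * config["runs_per_chat"]
    )


# A few-shot pool is configured, but the only variant uses zero examples.
is_zero_shot = all(v["count"] == 0 for v in config["few_shot_variants"])

print(expected_runs(config, chats_per_dataset), is_zero_shot)
```

Under these assumptions the result is 510 runs, which agrees with the run total reported in the overall results table.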

Overall results:

| Agent | Runs | Success | Avg attempts | Avg elapsed (s) | Avg mismatch/expected | Field match |
|---|---|---|---|---|---|---|
| so_extraction | 510 | 1.0000 | 1.0000 | 8.0545 | 4.5157 | 0.8101 |

Results by model:

| Agent | Model | FS count | Runs | Success | Avg attempts | Avg elapsed (s) | Avg mismatch/expected | Field match |
|---|---|---|---|---|---|---|---|---|
| so_extraction | gemini:gemini-2.5-flash | 0 | 51 | 1.0000 | 1.0000 | 13.2774 | 6.0784 | 0.7526 |
| so_extraction | gemini:gemini-2.5-pro | 0 | 51 | 1.0000 | 1.0000 | 19.6445 | 5.9216 | 0.7590 |
| so_extraction | openai:4.1 | 0 | 51 | 1.0000 | 1.0000 | 1.8013 | 4.1569 | 0.8308 |
| so_extraction | openai:5-mini | 0 | 51 | 1.0000 | 1.0000 | 20.9493 | 5.6275 | 0.7667 |
| so_extraction | openai:5.2 | 0 | 51 | 1.0000 | 1.0000 | 3.2435 | 5.5294 | 0.7749 |
| so_extraction | openai:5.4 | 0 | 51 | 1.0000 | 1.0000 | 2.4232 | 5.3529 | 0.7798 |
| so_extraction | opus-4-5 | 0 | 51 | 1.0000 | 1.0000 | 5.0071 | 2.9412 | 0.8708 |
| so_extraction | opus-4-6 | 0 | 51 | 1.0000 | 1.0000 | 5.0857 | 2.9216 | 0.8742 |
| so_extraction | sonnet-4-5 | 0 | 51 | 1.0000 | 1.0000 | 4.6297 | 3.5882 | 0.8250 |
| so_extraction | sonnet-4-6 | 0 | 51 | 1.0000 | 1.0000 | 4.4832 | 3.0392 | 0.8763 |
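
The per-model figures can be ranked programmatically. The sketch below copies the average elapsed time and field match rate from the table above, sorts models by accuracy, and lists the models that no other model beats on both speed and accuracy. The trade-off view is an illustrative addition rather than part of the harness output.

```python
# (avg elapsed seconds, field match rate), copied from the per-model table.
per_model = {
    "gemini:gemini-2.5-flash": (13.2774, 0.7526),
    "gemini:gemini-2.5-pro": (19.6445, 0.7590),
    "openai:4.1": (1.8013, 0.8308),
    "openai:5-mini": (20.9493, 0.7667),
    "openai:5.2": (3.2435, 0.7749),
    "openai:5.4": (2.4232, 0.7798),
    "opus-4-5": (5.0071, 0.8708),
    "opus-4-6": (5.0857, 0.8742),
    "sonnet-4-5": (4.6297, 0.8250),
    "sonnet-4-6": (4.4832, 0.8763),
}

# Highest field match rate first.
by_accuracy = sorted(per_model, key=lambda m: per_model[m][1], reverse=True)
fastest = min(per_model, key=lambda m: per_model[m][0])


def dominated(name):
    """True if another model is at least as fast and as accurate, and better on one."""
    t, acc = per_model[name]
    return any(
        other != name and ot <= t and oacc >= acc and (ot < t or oacc > acc)
        for other, (ot, oacc) in per_model.items()
    )


# Models that are not beaten on both axes at once.
best_tradeoffs = sorted(m for m in per_model if not dominated(m))

print(by_accuracy)
print(fastest, best_tradeoffs)
```

On these numbers, only `openai:4.1` and `sonnet-4-6` remain undominated: the first is the fastest, the second the most accurate, and every other model is both slower and less accurate than at least one of them.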

Results by few-shot count:

| Agent | FS count | Runs | Success | Avg mismatch/expected | Field match |
|---|---|---|---|---|---|
| so_extraction | 0 | 510 | 1.0000 | 4.5157 | 0.8101 |

Results by dataset:

| Agent | Dataset | Runs | Success | Avg elapsed (s) | Avg mismatch/expected | Field match |
|---|---|---|---|---|---|---|
| so_extraction | acme_foods | 210 | 1.0000 | 8.1793 | 2.4571 | 0.9014 |
| so_extraction | downloaded | 90 | 1.0000 | 9.0965 | 14.4444 | 0.3783 |
| so_extraction | nova_exports | 210 | 1.0000 | 7.4831 | 2.3190 | 0.8986 |
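
The per-dataset rows can be checked against the overall figures with a run-weighted average. The sketch below uses only the values in the table above. The average mismatch count reproduces the overall 4.5157 to within rounding, while the run-weighted field match rate (about 0.808) does not reproduce the overall 0.8101, presumably because the field match rate is pooled over fields and runs differ in how many fields they expect.

```python
# (runs, avg mismatch per run, field match rate), copied from the per-dataset table.
per_dataset = {
    "acme_foods": (210, 2.4571, 0.9014),
    "downloaded": (90, 14.4444, 0.3783),
    "nova_exports": (210, 2.3190, 0.8986),
}

total_runs = sum(n for n, _, _ in per_dataset.values())

# Per-run averages combine exactly when weighted by run count.
weighted_mismatch = sum(n * mm for n, mm, _ in per_dataset.values()) / total_runs

# Field match is pooled over fields, so weighting by runs is only approximate.
run_weighted_match = sum(n * fm for n, _, fm in per_dataset.values()) / total_runs

print(total_runs, round(weighted_mismatch, 4), round(run_weighted_match, 4))
```

The same calculation shows how heavily the `downloaded` dataset weighs on the headline number: it accounts for 90 of 510 runs but more than half of all mismatched fields.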

Results by chat and model:

| Agent | Chat | Model | FS count | Runs | Success | Avg elapsed (s) | Avg mismatch/expected |
|---|---|---|---|---|---|---|---|
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 19.6091 | 15.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 20.2502 | 15.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:4.1 | 0 | 1 | 1.0000 | 3.7683 | 17.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5-mini | 0 | 1 | 1.0000 | 31.6417 | 15.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.2 | 0 | 1 | 1.0000 | 5.2731 | 15.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.4 | 0 | 1 | 1.0000 | 4.6332 | 16.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-5 | 0 | 1 | 1.0000 | 6.7855 | 15.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-6 | 0 | 1 | 1.0000 | 6.6117 | 15.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-5 | 0 | 1 | 1.0000 | 7.3822 | 15.0000 |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-6 | 0 | 1 | 1.0000 | 6.0964 | 15.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 13.4846 | 15.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 14.0097 | 15.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:4.1 | 0 | 1 | 1.0000 | 7.5147 | 14.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5-mini | 0 | 1 | 1.0000 | 18.1884 | 13.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.2 | 0 | 1 | 1.0000 | 7.7245 | 14.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.4 | 0 | 1 | 1.0000 | 5.0573 | 14.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-5 | 0 | 1 | 1.0000 | 6.5817 | 14.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 0 | 1 | 1.0000 | 6.9395 | 14.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 0 | 1 | 1.0000 | 6.2408 | 14.0000 |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 0 | 1 | 1.0000 | 6.5215 | 15.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 20.1109 | 20.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 32.2827 | 20.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:4.1 | 0 | 1 | 1.0000 | 4.4666 | 17.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5-mini | 0 | 1 | 1.0000 | 23.4266 | 18.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.2 | 0 | 1 | 1.0000 | 4.0920 | 16.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.4 | 0 | 1 | 1.0000 | 5.5559 | 17.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-5 | 0 | 1 | 1.0000 | 8.9394 | 16.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-6 | 0 | 1 | 1.0000 | 8.7251 | 16.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-5 | 0 | 1 | 1.0000 | 10.8801 | 17.0000 |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-6 | 0 | 1 | 1.0000 | 7.9129 | 16.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 14.6141 | 17.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 11.7880 | 16.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:4.1 | 0 | 1 | 1.0000 | 1.7635 | 15.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:5-mini | 0 | 1 | 1.0000 | 18.0369 | 17.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:5.2 | 0 | 1 | 1.0000 | 3.5854 | 16.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:5.4 | 0 | 1 | 1.0000 | 3.9828 | 16.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | opus-4-5 | 0 | 1 | 1.0000 | 4.9230 | 15.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | opus-4-6 | 0 | 1 | 1.0000 | 5.0254 | 15.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.9733 | 16.0000 |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3553 | 16.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 10.3316 | 15.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 17.9630 | 15.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:4.1 | 0 | 1 | 1.0000 | 1.8150 | 13.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5-mini | 0 | 1 | 1.0000 | 13.4473 | 13.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.2 | 0 | 1 | 1.0000 | 2.3782 | 14.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.4 | 0 | 1 | 1.0000 | 2.5115 | 14.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-5 | 0 | 1 | 1.0000 | 5.4960 | 14.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 0 | 1 | 1.0000 | 5.0261 | 13.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.0597 | 13.0000 |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3351 | 14.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 8.3897 | 14.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 17.8292 | 14.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:4.1 | 0 | 1 | 1.0000 | 1.6348 | 16.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5-mini | 0 | 1 | 1.0000 | 17.9023 | 15.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.2 | 0 | 1 | 1.0000 | 2.4209 | 14.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.4 | 0 | 1 | 1.0000 | 2.8629 | 15.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-5 | 0 | 1 | 1.0000 | 5.6257 | 14.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-6 | 0 | 1 | 1.0000 | 5.3624 | 14.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.5365 | 14.0000 |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.7131 | 13.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 10.9964 | 17.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 16.9122 | 17.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:4.1 | 0 | 1 | 1.0000 | 1.1180 | 17.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5-mini | 0 | 1 | 1.0000 | 11.0887 | 17.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.2 | 0 | 1 | 1.0000 | 2.1140 | 17.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.4 | 0 | 1 | 1.0000 | 2.7897 | 17.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-5 | 0 | 1 | 1.0000 | 3.0860 | 1.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-6 | 0 | 1 | 1.0000 | 2.9306 | 1.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-5 | 0 | 1 | 1.0000 | 2.8644 | 1.0000 |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-6 | 0 | 1 | 1.0000 | 3.8834 | 17.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 27.1891 | 16.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 22.9985 | 13.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:4.1 | 0 | 1 | 1.0000 | 1.7815 | 14.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5-mini | 0 | 1 | 1.0000 | 26.4686 | 12.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.2 | 0 | 1 | 1.0000 | 2.8005 | 14.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.4 | 0 | 1 | 1.0000 | 3.5191 | 15.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-5 | 0 | 1 | 1.0000 | 4.9217 | 14.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 0 | 1 | 1.0000 | 5.7270 | 12.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.3733 | 19.0000 |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.4543 | 18.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 32.3192 | 13.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 23.5488 | 11.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:4.1 | 0 | 1 | 1.0000 | 1.7096 | 10.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5-mini | 0 | 1 | 1.0000 | 27.6829 | 9.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.2 | 0 | 1 | 1.0000 | 2.4864 | 13.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.4 | 0 | 1 | 1.0000 | 2.9683 | 14.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-5 | 0 | 1 | 1.0000 | 5.5241 | 10.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 0 | 1 | 1.0000 | 6.1518 | 10.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.5088 | 17.0000 |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3714 | 16.0000 |
| so_extraction | fs_acme_simple.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 7.9266 | 5.0000 |
| so_extraction | fs_acme_simple.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 16.0102 | 1.0000 |
| so_extraction | fs_acme_simple.json | openai:4.1 | 0 | 1 | 1.0000 | 1.9005 | 2.0000 |
| so_extraction | fs_acme_simple.json | openai:5-mini | 0 | 1 | 1.0000 | 22.0983 | 8.0000 |
| so_extraction | fs_acme_simple.json | openai:5.2 | 0 | 1 | 1.0000 | 2.5627 | 2.0000 |
| so_extraction | fs_acme_simple.json | openai:5.4 | 0 | 1 | 1.0000 | 2.5284 | 3.0000 |
| so_extraction | fs_acme_simple.json | opus-4-5 | 0 | 1 | 1.0000 | 5.3027 | 2.0000 |
| so_extraction | fs_acme_simple.json | opus-4-6 | 0 | 1 | 1.0000 | 5.0720 | 2.0000 |
| so_extraction | fs_acme_simple.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.0457 | 1.0000 |
| so_extraction | fs_acme_simple.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3257 | 0.0000 |
| so_extraction | fs_nova_simple.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 2.3495 | 1.0000 |
| so_extraction | fs_nova_simple.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 9.4151 | 1.0000 |
| so_extraction | fs_nova_simple.json | openai:4.1 | 0 | 1 | 1.0000 | 1.1721 | 1.0000 |
| so_extraction | fs_nova_simple.json | openai:5-mini | 0 | 1 | 1.0000 | 6.3458 | 1.0000 |
| so_extraction | fs_nova_simple.json | openai:5.2 | 0 | 1 | 1.0000 | 1.8342 | 1.0000 |
| so_extraction | fs_nova_simple.json | openai:5.4 | 0 | 1 | 1.0000 | 1.6388 | 1.0000 |
| so_extraction | fs_nova_simple.json | opus-4-5 | 0 | 1 | 1.0000 | 3.2021 | 1.0000 |
| so_extraction | fs_nova_simple.json | opus-4-6 | 0 | 1 | 1.0000 | 2.9481 | 1.0000 |
| so_extraction | fs_nova_simple.json | sonnet-4-5 | 0 | 1 | 1.0000 | 2.2102 | 1.0000 |
| so_extraction | fs_nova_simple.json | sonnet-4-6 | 0 | 1 | 1.0000 | 2.2571 | 1.0000 |
| so_extraction | generated_acme_foods_001.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 7.3799 | 6.0000 |
| so_extraction | generated_acme_foods_001.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 17.4138 | 5.0000 |
| so_extraction | generated_acme_foods_001.json | openai:4.1 | 0 | 1 | 1.0000 | 1.6754 | 2.0000 |
| so_extraction | generated_acme_foods_001.json | openai:5-mini | 0 | 1 | 1.0000 | 22.8033 | 5.0000 |
| so_extraction | generated_acme_foods_001.json | openai:5.2 | 0 | 1 | 1.0000 | 1.9828 | 4.0000 |
| so_extraction | generated_acme_foods_001.json | openai:5.4 | 0 | 1 | 1.0000 | 2.1982 | 10.0000 |
| so_extraction | generated_acme_foods_001.json | opus-4-5 | 0 | 1 | 1.0000 | 4.6399 | 2.0000 |
| so_extraction | generated_acme_foods_001.json | opus-4-6 | 0 | 1 | 1.0000 | 4.9475 | 2.0000 |
| so_extraction | generated_acme_foods_001.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.1543 | 3.0000 |
| so_extraction | generated_acme_foods_001.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.4829 | 0.0000 |
| so_extraction | generated_acme_foods_002.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 16.6114 | 6.0000 |
| so_extraction | generated_acme_foods_002.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 21.6172 | 4.0000 |
| so_extraction | generated_acme_foods_002.json | openai:4.1 | 0 | 1 | 1.0000 | 1.6609 | 1.0000 |
| so_extraction | generated_acme_foods_002.json | openai:5-mini | 0 | 1 | 1.0000 | 23.2564 | 3.0000 |
| so_extraction | generated_acme_foods_002.json | openai:5.2 | 0 | 1 | 1.0000 | 2.8699 | 4.0000 |
| so_extraction | generated_acme_foods_002.json | openai:5.4 | 0 | 1 | 1.0000 | 3.0159 | 2.0000 |
| so_extraction | generated_acme_foods_002.json | opus-4-5 | 0 | 1 | 1.0000 | 6.9093 | 0.0000 |
| so_extraction | generated_acme_foods_002.json | opus-4-6 | 0 | 1 | 1.0000 | 6.8596 | 0.0000 |
| so_extraction | generated_acme_foods_002.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.8389 | 1.0000 |
| so_extraction | generated_acme_foods_002.json | sonnet-4-6 | 0 | 1 | 1.0000 | 5.0953 | 0.0000 |
| so_extraction | generated_acme_foods_003.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 14.9582 | 7.0000 |
| so_extraction | generated_acme_foods_003.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 30.5756 | 4.0000 |
| so_extraction | generated_acme_foods_003.json | openai:4.1 | 0 | 1 | 1.0000 | 1.1413 | 3.0000 |
| so_extraction | generated_acme_foods_003.json | openai:5-mini | 0 | 1 | 1.0000 | 20.3155 | 2.0000 |
| so_extraction | generated_acme_foods_003.json | openai:5.2 | 0 | 1 | 1.0000 | 2.4039 | 3.0000 |
| so_extraction | generated_acme_foods_003.json | openai:5.4 | 0 | 1 | 1.0000 | 1.9078 | 3.0000 |
| so_extraction | generated_acme_foods_003.json | opus-4-5 | 0 | 1 | 1.0000 | 4.8388 | 1.0000 |
| so_extraction | generated_acme_foods_003.json | opus-4-6 | 0 | 1 | 1.0000 | 4.7153 | 1.0000 |
| so_extraction | generated_acme_foods_003.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.4947 | 1.0000 |
| so_extraction | generated_acme_foods_003.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3361 | 0.0000 |
| so_extraction | generated_acme_foods_004.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.2684 | 5.0000 |
| so_extraction | generated_acme_foods_004.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 26.3929 | 4.0000 |
| so_extraction | generated_acme_foods_004.json | openai:4.1 | 0 | 1 | 1.0000 | 1.4863 | 1.0000 |
| so_extraction | generated_acme_foods_004.json | openai:5-mini | 0 | 1 | 1.0000 | 18.5843 | 4.0000 |
| so_extraction | generated_acme_foods_004.json | openai:5.2 | 0 | 1 | 1.0000 | 1.9359 | 6.0000 |
| so_extraction | generated_acme_foods_004.json | openai:5.4 | 0 | 1 | 1.0000 | 2.3360 | 5.0000 |
| so_extraction | generated_acme_foods_004.json | opus-4-5 | 0 | 1 | 1.0000 | 4.7882 | 1.0000 |
| so_extraction | generated_acme_foods_004.json | opus-4-6 | 0 | 1 | 1.0000 | 5.3375 | 2.0000 |
| so_extraction | generated_acme_foods_004.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.5308 | 4.0000 |
| so_extraction | generated_acme_foods_004.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.1569 | 0.0000 |
| so_extraction | generated_acme_foods_005.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 13.1000 | 3.0000 |
| so_extraction | generated_acme_foods_005.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 18.9218 | 2.0000 |
| so_extraction | generated_acme_foods_005.json | openai:4.1 | 0 | 1 | 1.0000 | 1.8184 | 0.0000 |
| so_extraction | generated_acme_foods_005.json | openai:5-mini | 0 | 1 | 1.0000 | 17.0029 | 2.0000 |
| so_extraction | generated_acme_foods_005.json | openai:5.2 | 0 | 1 | 1.0000 | 2.4423 | 4.0000 |
| so_extraction | generated_acme_foods_005.json | openai:5.4 | 0 | 1 | 1.0000 | 1.9070 | 2.0000 |
| so_extraction | generated_acme_foods_005.json | opus-4-5 | 0 | 1 | 1.0000 | 4.4722 | 0.0000 |
| so_extraction | generated_acme_foods_005.json | opus-4-6 | 0 | 1 | 1.0000 | 4.6645 | 0.0000 |
| so_extraction | generated_acme_foods_005.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.0587 | 0.0000 |
| so_extraction | generated_acme_foods_005.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.8454 | 0.0000 |
| so_extraction | generated_acme_foods_006.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 11.4413 | 6.0000 |
| so_extraction | generated_acme_foods_006.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 24.1392 | 5.0000 |
| so_extraction | generated_acme_foods_006.json | openai:4.1 | 0 | 1 | 1.0000 | 1.9835 | 0.0000 |
| so_extraction | generated_acme_foods_006.json | openai:5-mini | 0 | 1 | 1.0000 | 18.6907 | 2.0000 |
| so_extraction | generated_acme_foods_006.json | openai:5.2 | 0 | 1 | 1.0000 | 3.2016 | 2.0000 |
| so_extraction | generated_acme_foods_006.json | openai:5.4 | 0 | 1 | 1.0000 | 2.9571 | 2.0000 |
| so_extraction | generated_acme_foods_006.json | opus-4-5 | 0 | 1 | 1.0000 | 4.4893 | 0.0000 |
| so_extraction | generated_acme_foods_006.json | opus-4-6 | 0 | 1 | 1.0000 | 4.2869 | 0.0000 |
| so_extraction | generated_acme_foods_006.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.4302 | 0.0000 |
| so_extraction | generated_acme_foods_006.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.7480 | 0.0000 |
| so_extraction | generated_acme_foods_007.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 8.4728 | 2.0000 |
| so_extraction | generated_acme_foods_007.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 19.3067 | 3.0000 |
| so_extraction | generated_acme_foods_007.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3879 | 0.0000 |
| so_extraction | generated_acme_foods_007.json | openai:5-mini | 0 | 1 | 1.0000 | 17.1148 | 4.0000 |
| so_extraction | generated_acme_foods_007.json | openai:5.2 | 0 | 1 | 1.0000 | 3.0080 | 3.0000 |
| so_extraction | generated_acme_foods_007.json | openai:5.4 | 0 | 1 | 1.0000 | 2.1861 | 2.0000 |
| so_extraction | generated_acme_foods_007.json | opus-4-5 | 0 | 1 | 1.0000 | 4.7055 | 0.0000 |
| so_extraction | generated_acme_foods_007.json | opus-4-6 | 0 | 1 | 1.0000 | 4.7234 | 0.0000 |
| so_extraction | generated_acme_foods_007.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.9869 | 0.0000 |
| so_extraction | generated_acme_foods_007.json | sonnet-4-6 | 0 | 1 | 1.0000 | 5.4660 | 0.0000 |
| so_extraction | generated_acme_foods_008.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 13.0931 | 1.0000 |
| so_extraction | generated_acme_foods_008.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 16.6897 | 2.0000 |
| so_extraction | generated_acme_foods_008.json | openai:4.1 | 0 | 1 | 1.0000 | 1.2331 | 0.0000 |
| so_extraction | generated_acme_foods_008.json | openai:5-mini | 0 | 1 | 1.0000 | 19.3367 | 4.0000 |
| so_extraction | generated_acme_foods_008.json | openai:5.2 | 0 | 1 | 1.0000 | 2.1675 | 2.0000 |
| so_extraction | generated_acme_foods_008.json | openai:5.4 | 0 | 1 | 1.0000 | 2.2745 | 2.0000 |
| so_extraction | generated_acme_foods_008.json | opus-4-5 | 0 | 1 | 1.0000 | 4.7570 | 0.0000 |
| so_extraction | generated_acme_foods_008.json | opus-4-6 | 0 | 1 | 1.0000 | 4.6774 | 0.0000 |
| so_extraction | generated_acme_foods_008.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.7546 | 0.0000 |
| so_extraction | generated_acme_foods_008.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.8862 | 0.0000 |
| so_extraction | generated_acme_foods_009.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.9822 | 4.0000 |
| so_extraction | generated_acme_foods_009.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 18.1983 | 4.0000 |
| so_extraction | generated_acme_foods_009.json | openai:4.1 | 0 | 1 | 1.0000 | 1.5970 | 0.0000 |
| so_extraction | generated_acme_foods_009.json | openai:5-mini | 0 | 1 | 1.0000 | 15.4396 | 2.0000 |
| so_extraction | generated_acme_foods_009.json | openai:5.2 | 0 | 1 | 1.0000 | 2.2270 | 3.0000 |
| so_extraction | generated_acme_foods_009.json | openai:5.4 | 0 | 1 | 1.0000 | 1.9163 | 2.0000 |
| so_extraction | generated_acme_foods_009.json | opus-4-5 | 0 | 1 | 1.0000 | 4.6876 | 0.0000 |
| so_extraction | generated_acme_foods_009.json | opus-4-6 | 0 | 1 | 1.0000 | 5.2735 | 0.0000 |
| so_extraction | generated_acme_foods_009.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.0860 | 0.0000 |
| so_extraction | generated_acme_foods_009.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.0713 | 0.0000 |
| so_extraction | generated_acme_foods_010.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 18.7405 | 5.0000 |
| so_extraction | generated_acme_foods_010.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 21.4434 | 1.0000 |
| so_extraction | generated_acme_foods_010.json | openai:4.1 | 0 | 1 | 1.0000 | 1.4940 | 2.0000 |
| so_extraction | generated_acme_foods_010.json | openai:5-mini | 0 | 1 | 1.0000 | 14.4249 | 4.0000 |
| so_extraction | generated_acme_foods_010.json | openai:5.2 | 0 | 1 | 1.0000 | 2.2328 | 6.0000 |
| so_extraction | generated_acme_foods_010.json | openai:5.4 | 0 | 1 | 1.0000 | 2.6585 | 2.0000 |
| so_extraction | generated_acme_foods_010.json | opus-4-5 | 0 | 1 | 1.0000 | 5.8735 | 0.0000 |
| so_extraction | generated_acme_foods_010.json | opus-4-6 | 0 | 1 | 1.0000 | 8.4032 | 0.0000 |
| so_extraction | generated_acme_foods_010.json | sonnet-4-5 | 0 | 1 | 1.0000 | 6.2274 | 1.0000 |
| so_extraction | generated_acme_foods_010.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.7543 | 0.0000 |
| so_extraction | generated_nova_exports_001.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 10.1705 | 6.0000 |
| so_extraction | generated_nova_exports_001.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 15.9954 | 4.0000 |
| so_extraction | generated_nova_exports_001.json | openai:4.1 | 0 | 1 | 1.0000 | 1.7341 | 2.0000 |
| so_extraction | generated_nova_exports_001.json | openai:5-mini | 0 | 1 | 1.0000 | 23.0455 | 3.0000 |
| so_extraction | generated_nova_exports_001.json | openai:5.2 | 0 | 1 | 1.0000 | 2.2121 | 2.0000 |
| so_extraction | generated_nova_exports_001.json | openai:5.4 | 0 | 1 | 1.0000 | 2.1248 | 2.0000 |
| so_extraction | generated_nova_exports_001.json | opus-4-5 | 0 | 1 | 1.0000 | 4.6349 | 2.0000 |
| so_extraction | generated_nova_exports_001.json | opus-4-6 | 0 | 1 | 1.0000 | 4.9475 | 2.0000 |
| so_extraction | generated_nova_exports_001.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.4487 | 3.0000 |
| so_extraction | generated_nova_exports_001.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3903 | 2.0000 |
| so_extraction | generated_nova_exports_002.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 13.3423 | 3.0000 |
| so_extraction | generated_nova_exports_002.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 19.2992 | 1.0000 |
| so_extraction | generated_nova_exports_002.json | openai:4.1 | 0 | 1 | 1.0000 | 1.4286 | 2.0000 |
| so_extraction | generated_nova_exports_002.json | openai:5-mini | 0 | 1 | 1.0000 | 20.2700 | 6.0000 |
| so_extraction | generated_nova_exports_002.json | openai:5.2 | 0 | 1 | 1.0000 | 2.2197 | 4.0000 |
| so_extraction | generated_nova_exports_002.json | openai:5.4 | 0 | 1 | 1.0000 | 2.5463 | 2.0000 |
| so_extraction | generated_nova_exports_002.json | opus-4-5 | 0 | 1 | 1.0000 | 5.9052 | 0.0000 |
| so_extraction | generated_nova_exports_002.json | opus-4-6 | 0 | 1 | 1.0000 | 5.8852 | 0.0000 |
| so_extraction | generated_nova_exports_002.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.0715 | 1.0000 |
| so_extraction | generated_nova_exports_002.json | sonnet-4-6 | 0 | 1 | 1.0000 | 5.1104 | 0.0000 |
| so_extraction | generated_nova_exports_003.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.2260 | 7.0000 |
| so_extraction | generated_nova_exports_003.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 17.4064 | 9.0000 |
| so_extraction | generated_nova_exports_003.json | openai:4.1 | 0 | 1 | 1.0000 | 1.2030 | 1.0000 |
| so_extraction | generated_nova_exports_003.json | openai:5-mini | 0 | 1 | 1.0000 | 11.2759 | 3.0000 |
| so_extraction | generated_nova_exports_003.json | openai:5.2 | 0 | 1 | 1.0000 | 2.1200 | 3.0000 |
| so_extraction | generated_nova_exports_003.json | openai:5.4 | 0 | 1 | 1.0000 | 1.9867 | 5.0000 |
| so_extraction | generated_nova_exports_003.json | opus-4-5 | 0 | 1 | 1.0000 | 4.5789 | 1.0000 |
| so_extraction | generated_nova_exports_003.json | opus-4-6 | 0 | 1 | 1.0000 | 4.9207 | 1.0000 |
| so_extraction | generated_nova_exports_003.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.4431 | 2.0000 |
| so_extraction | generated_nova_exports_003.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3251 | 0.0000 |
| so_extraction | generated_nova_exports_004.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.9219 | 6.0000 |
| so_extraction | generated_nova_exports_004.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 17.7439 | 5.0000 |
| so_extraction | generated_nova_exports_004.json | openai:4.1 | 0 | 1 | 1.0000 | 1.8232 | 1.0000 |
| so_extraction | generated_nova_exports_004.json | openai:5-mini | 0 | 1 | 1.0000 | 18.3899 | 6.0000 |
| so_extraction | generated_nova_exports_004.json | openai:5.2 | 0 | 1 | 1.0000 | 1.7227 | 5.0000 |
| so_extraction | generated_nova_exports_004.json | openai:5.4 | 0 | 1 | 1.0000 | 1.8303 | 5.0000 |
| so_extraction | generated_nova_exports_004.json | opus-4-5 | 0 | 1 | 1.0000 | 4.9283 | 1.0000 |
| so_extraction | generated_nova_exports_004.json | opus-4-6 | 0 | 1 | 1.0000 | 4.8843 | 3.0000 |
| so_extraction | generated_nova_exports_004.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.8213 | 4.0000 |
| so_extraction | generated_nova_exports_004.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.0740 | 0.0000 |
| so_extraction | generated_nova_exports_005.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 10.1177 | 1.0000 |
| so_extraction | generated_nova_exports_005.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 15.8833 | 3.0000 |
| so_extraction | generated_nova_exports_005.json | openai:4.1 | 0 | 1 | 1.0000 | 1.1521 | 0.0000 |
| so_extraction | generated_nova_exports_005.json | openai:5-mini | 0 | 1 | 1.0000 | 17.9456 | 2.0000 |
| so_extraction | generated_nova_exports_005.json | openai:5.2 | 0 | 1 | 1.0000 | 3.6980 | 2.0000 |
| so_extraction | generated_nova_exports_005.json | openai:5.4 | 0 | 1 | 1.0000 | 1.8547 | 2.0000 |
| so_extraction | generated_nova_exports_005.json | opus-4-5 | 0 | 1 | 1.0000 | 4.3729 | 0.0000 |
| so_extraction | generated_nova_exports_005.json | opus-4-6 | 0 | 1 | 1.0000 | 5.4155 | 0.0000 |
| so_extraction | generated_nova_exports_005.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.2417 | 1.0000 |
| so_extraction | generated_nova_exports_005.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.2292 | 0.0000 |
| so_extraction | generated_nova_exports_006.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 5.0896 | 3.0000 |
| so_extraction | generated_nova_exports_006.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 21.4711 | 3.0000 |
| so_extraction | generated_nova_exports_006.json | openai:4.1 | 0 | 1 | 1.0000 | 1.4011 | 1.0000 |
| so_extraction | generated_nova_exports_006.json | openai:5-mini | 0 | 1 | 1.0000 | 16.7322 | 3.0000 |
| so_extraction | generated_nova_exports_006.json | openai:5.2 | 0 | 1 | 1.0000 | 2.0074 | 2.0000 |
| so_extraction | generated_nova_exports_006.json | openai:5.4 | 0 | 1 | 1.0000 | 2.2382 | 2.0000 |
| so_extraction | generated_nova_exports_006.json | opus-4-5 | 0 | 1 | 1.0000 | 4.6257 | 0.0000 |
| so_extraction | generated_nova_exports_006.json | opus-4-6 | 0 | 1 | 1.0000 | 5.1128 | 0.0000 |
| so_extraction | generated_nova_exports_006.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.4110 | 1.0000 |
| so_extraction | generated_nova_exports_006.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.2117 | 0.0000 |
| so_extraction | generated_nova_exports_007.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 11.0278 | 1.0000 |
| so_extraction | generated_nova_exports_007.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 12.7048 | 0.0000 |
| so_extraction | generated_nova_exports_007.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3488 | 1.0000 |
| so_extraction | generated_nova_exports_007.json | openai:5-mini | 0 | 1 | 1.0000 | 29.3463 | 3.0000 |
| so_extraction | generated_nova_exports_007.json | openai:5.2 | 0 | 1 | 1.0000 | 1.6647 | 3.0000 |
| so_extraction | generated_nova_exports_007.json | openai:5.4 | 0 | 1 | 1.0000 | 2.1860 | 3.0000 |
| so_extraction | generated_nova_exports_007.json | opus-4-5 | 0 | 1 | 1.0000 | 4.8505 | 0.0000 |
| so_extraction | generated_nova_exports_007.json | opus-4-6 | 0 | 1 | 1.0000 | 5.0850 | 0.0000 |
| so_extraction | generated_nova_exports_007.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.0717 | 0.0000 |
| so_extraction | generated_nova_exports_007.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.2295 | 0.0000 |
| so_extraction | generated_nova_exports_008.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 19.6377 | 2.0000 |
| so_extraction | generated_nova_exports_008.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 18.3525 | 4.0000 |
| so_extraction | generated_nova_exports_008.json | openai:4.1 | 0 | 1 | 1.0000 | 1.7157 | 0.0000 |
| so_extraction | generated_nova_exports_008.json | openai:5-mini | 0 | 1 | 1.0000 | 26.3901 | 2.0000 |
| so_extraction | generated_nova_exports_008.json | openai:5.2 | 0 | 1 | 1.0000 | 3.5669 | 2.0000 |
| so_extraction | generated_nova_exports_008.json | openai:5.4 | 0 | 1 | 1.0000 | 1.8317 | 2.0000 |
| so_extraction | generated_nova_exports_008.json | opus-4-5 | 0 | 1 | 1.0000 | 4.4424 | 0.0000 |
| so_extraction | generated_nova_exports_008.json | opus-4-6 | 0 | 1 | 1.0000 | 4.8895 | 0.0000 |
| so_extraction | generated_nova_exports_008.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.4788 | 1.0000 |
| so_extraction | generated_nova_exports_008.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.1210 | 0.0000 |
| so_extraction | generated_nova_exports_009.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 9.6482 | 2.0000 |
| so_extraction | generated_nova_exports_009.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 19.4427 | 2.0000 |
| so_extraction | generated_nova_exports_009.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3685 | 0.0000 |
| so_extraction | generated_nova_exports_009.json | openai:5-mini | 0 | 1 | 1.0000 | 20.1629 | 2.0000 |
| so_extraction | generated_nova_exports_009.json | openai:5.2 | 0 | 1 | 1.0000 | 2.0380 | 2.0000 |
| so_extraction | generated_nova_exports_009.json | openai:5.4 | 0 | 1 | 1.0000 | 2.1798 | 2.0000 |
| so_extraction | generated_nova_exports_009.json | opus-4-5 | 0 | 1 | 1.0000 | 4.6709 | 0.0000 |
| so_extraction | generated_nova_exports_009.json | opus-4-6 | 0 | 1 | 1.0000 | 4.8325 | 0.0000 |
| so_extraction | generated_nova_exports_009.json | sonnet-4-5 | 0 | 1 | 1.0000 | 3.9743 | 1.0000 |
| so_extraction | generated_nova_exports_009.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.5180 | 0.0000 |
| so_extraction | generated_nova_exports_010.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 8.3614 | 4.0000 |
| so_extraction | generated_nova_exports_010.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 14.8309 | 3.0000 |
| so_extraction | generated_nova_exports_010.json | openai:4.1 | 0 | 1 | 1.0000 | 1.8049 | 4.0000 |
| so_extraction | generated_nova_exports_010.json | openai:5-mini | 0 | 1 | 1.0000 | 15.7686 | 4.0000 |
| so_extraction | generated_nova_exports_010.json | openai:5.2 | 0 | 1 | 1.0000 | 2.4636 | 4.0000 |
| so_extraction | generated_nova_exports_010.json | openai:5.4 | 0 | 1 | 1.0000 | 2.2039 | 2.0000 |
| so_extraction | generated_nova_exports_010.json | opus-4-5 | 0 | 1 | 1.0000 | 6.2910 | 0.0000 |
| so_extraction | generated_nova_exports_010.json | opus-4-6 | 0 | 1 | 1.0000 | 5.9550 | 0.0000 |
| so_extraction | generated_nova_exports_010.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.0502 | 1.0000 |
| so_extraction | generated_nova_exports_010.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.9001 | 0.0000 |
| so_extraction | realistic_acme_foods_001.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 6.1146 | 7.0000 |
| so_extraction | realistic_acme_foods_001.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 18.4731 | 6.0000 |
| so_extraction | realistic_acme_foods_001.json | openai:4.1 | 0 | 1 | 1.0000 | 1.1193 | 3.0000 |
| so_extraction | realistic_acme_foods_001.json | openai:5-mini | 0 | 1 | 1.0000 | 28.6586 | 3.0000 |
| so_extraction | realistic_acme_foods_001.json | openai:5.2 | 0 | 1 | 1.0000 | 3.0003 | 5.0000 |
| so_extraction | realistic_acme_foods_001.json | openai:5.4 | 0 | 1 | 1.0000 | 2.0924 | 7.0000 |
| so_extraction | realistic_acme_foods_001.json | opus-4-5 | 0 | 1 | 1.0000 | 4.7882 | 2.0000 |
| so_extraction | realistic_acme_foods_001.json | opus-4-6 | 0 | 1 | 1.0000 | 4.6929 | 2.0000 |
| so_extraction | realistic_acme_foods_001.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.8481 | 3.0000 |
| so_extraction | realistic_acme_foods_001.json | sonnet-4-6 | 0 | 1 | 1.0000 | 3.9593 | 0.0000 |
| so_extraction | realistic_acme_foods_002.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 10.2994 | 5.0000 |
| so_extraction | realistic_acme_foods_002.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 17.8389 | 3.0000 |
| so_extraction | realistic_acme_foods_002.json | openai:4.1 | 0 | 1 | 1.0000 | 1.2707 | 1.0000 |
| so_extraction | realistic_acme_foods_002.json | openai:5-mini | 0 | 1 | 1.0000 | 26.7388 | 4.0000 |
| so_extraction | realistic_acme_foods_002.json | openai:5.2 | 0 | 1 | 1.0000 | 7.9718 | 4.0000 |
| so_extraction | realistic_acme_foods_002.json | openai:5.4 | 0 | 1 | 1.0000 | 2.0150 | 4.0000 |
| so_extraction | realistic_acme_foods_002.json | opus-4-5 | 0 | 1 | 1.0000 | 4.9555 | 0.0000 |
| so_extraction | realistic_acme_foods_002.json | opus-4-6 | 0 | 1 | 1.0000 | 5.2631 | 0.0000 |
| so_extraction | realistic_acme_foods_002.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.5517 | 0.0000 |
| so_extraction | realistic_acme_foods_002.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3146 | 0.0000 |
| so_extraction | realistic_acme_foods_003.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.9914 | 3.0000 |
| so_extraction | realistic_acme_foods_003.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 23.3319 | 3.0000 |
| so_extraction | realistic_acme_foods_003.json | openai:4.1 | 0 | 1 | 1.0000 | 1.6483 | 4.0000 |
| so_extraction | realistic_acme_foods_003.json | openai:5-mini | 0 | 1 | 1.0000 | 15.4838 | 3.0000 |
| so_extraction | realistic_acme_foods_003.json | openai:5.2 | 0 | 1 | 1.0000 | 11.0126 | 3.0000 |
| so_extraction | realistic_acme_foods_003.json | openai:5.4 | 0 | 1 | 1.0000 | 2.0606 | 3.0000 |
| so_extraction | realistic_acme_foods_003.json | opus-4-5 | 0 | 1 | 1.0000 | 5.0083 | 0.0000 |
| so_extraction | realistic_acme_foods_003.json | opus-4-6 | 0 | 1 | 1.0000 | 4.4763 | 1.0000 |
| so_extraction | realistic_acme_foods_003.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.1627 | 1.0000 |
| so_extraction | realistic_acme_foods_003.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.0330 | 0.0000 |
| so_extraction | realistic_acme_foods_004.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 10.1684 | 2.0000 |
| so_extraction | realistic_acme_foods_004.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 20.4846 | 6.0000 |
| so_extraction | realistic_acme_foods_004.json | openai:4.1 | 0 | 1 | 1.0000 | 1.2901 | 4.0000 |
| so_extraction | realistic_acme_foods_004.json | openai:5-mini | 0 | 1 | 1.0000 | 23.3948 | 6.0000 |
| so_extraction | realistic_acme_foods_004.json | openai:5.2 | 0 | 1 | 1.0000 | 8.7824 | 5.0000 |
| so_extraction | realistic_acme_foods_004.json | openai:5.4 | 0 | 1 | 1.0000 | 2.0444 | 3.0000 |
| so_extraction | realistic_acme_foods_004.json | opus-4-5 | 0 | 1 | 1.0000 | 4.5648 | 0.0000 |
| so_extraction | realistic_acme_foods_004.json | opus-4-6 | 0 | 1 | 1.0000 | 4.2712 | 0.0000 |
| so_extraction | realistic_acme_foods_004.json | sonnet-4-5 | 0 | 1 | 1.0000 | 2.6392 | 1.0000 |
| so_extraction | realistic_acme_foods_004.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.1950 | 0.0000 |
| so_extraction | realistic_acme_foods_005.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 20.2311 | 2.0000 |
| so_extraction | realistic_acme_foods_005.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 27.4563 | 6.0000 |
| so_extraction | realistic_acme_foods_005.json | openai:4.1 | 0 | 1 | 1.0000 | 1.2396 | 6.0000 |
| so_extraction | realistic_acme_foods_005.json | openai:5-mini | 0 | 1 | 1.0000 | 25.6974 | 3.0000 |
| so_extraction | realistic_acme_foods_005.json | openai:5.2 | 0 | 1 | 1.0000 | 12.6323 | 6.0000 |
| so_extraction | realistic_acme_foods_005.json | openai:5.4 | 0 | 1 | 1.0000 | 1.9805 | 4.0000 |
| so_extraction | realistic_acme_foods_005.json | opus-4-5 | 0 | 1 | 1.0000 | 5.2741 | 2.0000 |
| so_extraction | realistic_acme_foods_005.json | opus-4-6 | 0 | 1 | 1.0000 | 5.0390 | 3.0000 |
| so_extraction | realistic_acme_foods_005.json | sonnet-4-5 | 0 | 1 | 1.0000 | 2.7638 | 1.0000 |
| so_extraction | realistic_acme_foods_005.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3875 | 1.0000 |
| so_extraction | realistic_acme_foods_006.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.7844 | 2.0000 |
| so_extraction | realistic_acme_foods_006.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 23.9075 | 0.0000 |
| so_extraction | realistic_acme_foods_006.json | openai:4.1 | 0 | 1 | 1.0000 | 1.2162 | 0.0000 |
| so_extraction | realistic_acme_foods_006.json | openai:5-mini | 0 | 1 | 1.0000 | 1.9610 | 3.0000 |
| so_extraction | realistic_acme_foods_006.json | openai:5.2 | 0 | 1 | 1.0000 | 2.0838 | 3.0000 |
| so_extraction | realistic_acme_foods_006.json | openai:5.4 | 0 | 1 | 1.0000 | 1.7070 | 2.0000 |
| so_extraction | realistic_acme_foods_006.json | opus-4-5 | 0 | 1 | 1.0000 | 5.1456 | 0.0000 |
| so_extraction | realistic_acme_foods_006.json | opus-4-6 | 0 | 1 | 1.0000 | 4.9930 | 0.0000 |
| so_extraction | realistic_acme_foods_006.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.8872 | 0.0000 |
| so_extraction | realistic_acme_foods_006.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.1303 | 0.0000 |
| so_extraction | realistic_acme_foods_007.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 17.5073 | 8.0000 |
| so_extraction | realistic_acme_foods_007.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 21.9423 | 6.0000 |
| so_extraction | realistic_acme_foods_007.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3549 | 5.0000 |
| so_extraction | realistic_acme_foods_007.json | openai:5-mini | 0 | 1 | 1.0000 | 28.0184 | 6.0000 |
| so_extraction | realistic_acme_foods_007.json | openai:5.2 | 0 | 1 | 1.0000 | 3.1587 | 6.0000 |
| so_extraction | realistic_acme_foods_007.json | openai:5.4 | 0 | 1 | 1.0000 | 1.9197 | 7.0000 |
| so_extraction | realistic_acme_foods_007.json | opus-4-5 | 0 | 1 | 1.0000 | 5.2392 | 3.0000 |
| so_extraction | realistic_acme_foods_007.json | opus-4-6 | 0 | 1 | 1.0000 | 5.6859 | 3.0000 |
| so_extraction | realistic_acme_foods_007.json | sonnet-4-5 | 0 | 1 | 1.0000 | 3.9871 | 3.0000 |
| so_extraction | realistic_acme_foods_007.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3239 | 0.0000 |
| so_extraction | realistic_acme_foods_008.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 20.4202 | 5.0000 |
| so_extraction | realistic_acme_foods_008.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 19.4550 | 3.0000 |
| so_extraction | realistic_acme_foods_008.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3968 | 4.0000 |
| so_extraction | realistic_acme_foods_008.json | openai:5-mini | 0 | 1 | 1.0000 | 23.6305 | 5.0000 |
| so_extraction | realistic_acme_foods_008.json | openai:5.2 | 0 | 1 | 1.0000 | 4.6057 | 6.0000 |
| so_extraction | realistic_acme_foods_008.json | openai:5.4 | 0 | 1 | 1.0000 | 2.0218 | 5.0000 |
| so_extraction | realistic_acme_foods_008.json | opus-4-5 | 0 | 1 | 1.0000 | 5.4563 | 3.0000 |
| so_extraction | realistic_acme_foods_008.json | opus-4-6 | 0 | 1 | 1.0000 | 4.6696 | 3.0000 |
| so_extraction | realistic_acme_foods_008.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.8965 | 3.0000 |
| so_extraction | realistic_acme_foods_008.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.2515 | 0.0000 |
| so_extraction | realistic_acme_foods_009.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 14.9120 | 6.0000 |
| so_extraction | realistic_acme_foods_009.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 24.9335 | 7.0000 |
| so_extraction | realistic_acme_foods_009.json | openai:4.1 | 0 | 1 | 1.0000 | 1.5470 | 4.0000 |
| so_extraction | realistic_acme_foods_009.json | openai:5-mini | 0 | 1 | 1.0000 | 27.8516 | 5.0000 |
| so_extraction | realistic_acme_foods_009.json | openai:5.2 | 0 | 1 | 1.0000 | 4.3317 | 4.0000 |
| so_extraction | realistic_acme_foods_009.json | openai:5.4 | 0 | 1 | 1.0000 | 2.0511 | 3.0000 |
| so_extraction | realistic_acme_foods_009.json | opus-4-5 | 0 | 1 | 1.0000 | 5.5710 | 1.0000 |
| so_extraction | realistic_acme_foods_009.json | opus-4-6 | 0 | 1 | 1.0000 | 4.4012 | 0.0000 |
| so_extraction | realistic_acme_foods_009.json | sonnet-4-5 | 0 | 1 | 1.0000 | 2.5007 | 1.0000 |
| so_extraction | realistic_acme_foods_009.json | sonnet-4-6 | 0 | 1 | 1.0000 | 3.9493 | 0.0000 |
| so_extraction | realistic_acme_foods_010.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 9.1193 | 1.0000 |
| so_extraction | realistic_acme_foods_010.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 26.0655 | 2.0000 |
| so_extraction | realistic_acme_foods_010.json | openai:4.1 | 0 | 1 | 1.0000 | 1.4657 | 3.0000 |
| so_extraction | realistic_acme_foods_010.json | openai:5-mini | 0 | 1 | 1.0000 | 16.9028 | 5.0000 |
| so_extraction | realistic_acme_foods_010.json | openai:5.2 | 0 | 1 | 1.0000 | 2.4674 | 3.0000 |
| so_extraction | realistic_acme_foods_010.json | openai:5.4 | 0 | 1 | 1.0000 | 1.9987 | 2.0000 |
| so_extraction | realistic_acme_foods_010.json | opus-4-5 | 0 | 1 | 1.0000 | 4.4720 | 0.0000 |
| so_extraction | realistic_acme_foods_010.json | opus-4-6 | 0 | 1 | 1.0000 | 4.8886 | 0.0000 |
| so_extraction | realistic_acme_foods_010.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.1570 | 0.0000 |
| so_extraction | realistic_acme_foods_010.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3430 | 0.0000 |
| so_extraction | realistic_nova_exports_001.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 16.1502 | 4.0000 |
| so_extraction | realistic_nova_exports_001.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 18.2972 | 5.0000 |
| so_extraction | realistic_nova_exports_001.json | openai:4.1 | 0 | 1 | 1.0000 | 5.2296 | 3.0000 |
| so_extraction | realistic_nova_exports_001.json | openai:5-mini | 0 | 1 | 1.0000 | 14.3826 | 5.0000 |
| so_extraction | realistic_nova_exports_001.json | openai:5.2 | 0 | 1 | 1.0000 | 1.8066 | 6.0000 |
| so_extraction | realistic_nova_exports_001.json | openai:5.4 | 0 | 1 | 1.0000 | 2.0489 | 5.0000 |
| so_extraction | realistic_nova_exports_001.json | opus-4-5 | 0 | 1 | 1.0000 | 5.1107 | 4.0000 |
| so_extraction | realistic_nova_exports_001.json | opus-4-6 | 0 | 1 | 1.0000 | 4.8549 | 4.0000 |
| so_extraction | realistic_nova_exports_001.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.9928 | 4.0000 |
| so_extraction | realistic_nova_exports_001.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.6004 | 1.0000 |
| so_extraction | realistic_nova_exports_002.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 13.4630 | 5.0000 |
| so_extraction | realistic_nova_exports_002.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 26.8497 | 6.0000 |
| so_extraction | realistic_nova_exports_002.json | openai:4.1 | 0 | 1 | 1.0000 | 1.1512 | 1.0000 |
| so_extraction | realistic_nova_exports_002.json | openai:5-mini | 0 | 1 | 1.0000 | 29.6459 | 5.0000 |
| so_extraction | realistic_nova_exports_002.json | openai:5.2 | 0 | 1 | 1.0000 | 1.8544 | 3.0000 |
| so_extraction | realistic_nova_exports_002.json | openai:5.4 | 0 | 1 | 1.0000 | 1.5424 | 3.0000 |
| so_extraction | realistic_nova_exports_002.json | opus-4-5 | 0 | 1 | 1.0000 | 5.1487 | 1.0000 |
| so_extraction | realistic_nova_exports_002.json | opus-4-6 | 0 | 1 | 1.0000 | 2.9519 | 1.0000 |
| so_extraction | realistic_nova_exports_002.json | sonnet-4-5 | 0 | 1 | 1.0000 | 3.4218 | 1.0000 |
| so_extraction | realistic_nova_exports_002.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.1242 | 0.0000 |
| so_extraction | realistic_nova_exports_003.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 15.5776 | 1.0000 |
| so_extraction | realistic_nova_exports_003.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 22.7992 | 3.0000 |
| so_extraction | realistic_nova_exports_003.json | openai:4.1 | 0 | 1 | 1.0000 | 1.5676 | 2.0000 |
| so_extraction | realistic_nova_exports_003.json | openai:5-mini | 0 | 1 | 1.0000 | 41.8010 | 4.0000 |
| so_extraction | realistic_nova_exports_003.json | openai:5.2 | 0 | 1 | 1.0000 | 2.0607 | 4.0000 |
| so_extraction | realistic_nova_exports_003.json | openai:5.4 | 0 | 1 | 1.0000 | 2.5556 | 3.0000 |
| so_extraction | realistic_nova_exports_003.json | opus-4-5 | 0 | 1 | 1.0000 | 4.6944 | 3.0000 |
| so_extraction | realistic_nova_exports_003.json | opus-4-6 | 0 | 1 | 1.0000 | 4.4312 | 3.0000 |
| so_extraction | realistic_nova_exports_003.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.1307 | 4.0000 |
| so_extraction | realistic_nova_exports_003.json | sonnet-4-6 | 0 | 1 | 1.0000 | 3.8931 | 1.0000 |
| so_extraction | realistic_nova_exports_004.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 8.6169 | 5.0000 |
| so_extraction | realistic_nova_exports_004.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 13.1474 | 4.0000 |
| so_extraction | realistic_nova_exports_004.json | openai:4.1 | 0 | 1 | 1.0000 | 1.6967 | 3.0000 |
| so_extraction | realistic_nova_exports_004.json | openai:5-mini | 0 | 1 | 1.0000 | 30.6593 | 4.0000 |
| so_extraction | realistic_nova_exports_004.json | openai:5.2 | 0 | 1 | 1.0000 | 2.2076 | 3.0000 |
| so_extraction | realistic_nova_exports_004.json | openai:5.4 | 0 | 1 | 1.0000 | 2.4508 | 5.0000 |
| so_extraction | realistic_nova_exports_004.json | opus-4-5 | 0 | 1 | 1.0000 | 4.7815 | 0.0000 |
| so_extraction | realistic_nova_exports_004.json | opus-4-6 | 0 | 1 | 1.0000 | 4.7699 | 0.0000 |
| so_extraction | realistic_nova_exports_004.json | sonnet-4-5 | 0 | 1 | 1.0000 | 2.9631 | 1.0000 |
| so_extraction | realistic_nova_exports_004.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.4248 | 3.0000 |
| so_extraction | realistic_nova_exports_005.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 9.5758 | 4.0000 |
| so_extraction | realistic_nova_exports_005.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 22.9153 | 7.0000 |
| so_extraction | realistic_nova_exports_005.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3192 | 5.0000 |
| so_extraction | realistic_nova_exports_005.json | openai:5-mini | 0 | 1 | 1.0000 | 17.5695 | 4.0000 |
| so_extraction | realistic_nova_exports_005.json | openai:5.2 | 0 | 1 | 1.0000 | 1.9053 | 3.0000 |
| so_extraction | realistic_nova_exports_005.json | openai:5.4 | 0 | 1 | 1.0000 | 2.4769 | 3.0000 |
| so_extraction | realistic_nova_exports_005.json | opus-4-5 | 0 | 1 | 1.0000 | 3.1442 | 1.0000 |
| so_extraction | realistic_nova_exports_005.json | opus-4-6 | 0 | 1 | 1.0000 | 3.2239 | 1.0000 |
| so_extraction | realistic_nova_exports_005.json | sonnet-4-5 | 0 | 1 | 1.0000 | 5.3208 | 2.0000 |
| so_extraction | realistic_nova_exports_005.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.1954 | 0.0000 |
| so_extraction | realistic_nova_exports_006.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.5712 | 5.0000 |
| so_extraction | realistic_nova_exports_006.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 20.1784 | 4.0000 |
| so_extraction | realistic_nova_exports_006.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3690 | 1.0000 |
| so_extraction | realistic_nova_exports_006.json | openai:5-mini | 0 | 1 | 1.0000 | 21.7113 | 2.0000 |
| so_extraction | realistic_nova_exports_006.json | openai:5.2 | 0 | 1 | 1.0000 | 1.8672 | 3.0000 |
| so_extraction | realistic_nova_exports_006.json | openai:5.4 | 0 | 1 | 1.0000 | 2.1039 | 3.0000 |
| so_extraction | realistic_nova_exports_006.json | opus-4-5 | 0 | 1 | 1.0000 | 5.7154 | 0.0000 |
| so_extraction | realistic_nova_exports_006.json | opus-4-6 | 0 | 1 | 1.0000 | 4.8042 | 0.0000 |
| so_extraction | realistic_nova_exports_006.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.9521 | 0.0000 |
| so_extraction | realistic_nova_exports_006.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.0719 | 3.0000 |
| so_extraction | realistic_nova_exports_007.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 13.5330 | 3.0000 |
| so_extraction | realistic_nova_exports_007.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 13.9188 | 7.0000 |
| so_extraction | realistic_nova_exports_007.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3290 | 2.0000 |
| so_extraction | realistic_nova_exports_007.json | openai:5-mini | 0 | 1 | 1.0000 | 27.4265 | 5.0000 |
| so_extraction | realistic_nova_exports_007.json | openai:5.2 | 0 | 1 | 1.0000 | 1.7371 | 3.0000 |
| so_extraction | realistic_nova_exports_007.json | openai:5.4 | 0 | 1 | 1.0000 | 1.8682 | 2.0000 |
| so_extraction | realistic_nova_exports_007.json | opus-4-5 | 0 | 1 | 1.0000 | 3.0017 | 1.0000 |
| so_extraction | realistic_nova_exports_007.json | opus-4-6 | 0 | 1 | 1.0000 | 4.4278 | 1.0000 |
| so_extraction | realistic_nova_exports_007.json | sonnet-4-5 | 0 | 1 | 1.0000 | 2.7395 | 1.0000 |
| so_extraction | realistic_nova_exports_007.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.2059 | 0.0000 |
| so_extraction | realistic_nova_exports_008.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 10.5930 | 5.0000 |
| so_extraction | realistic_nova_exports_008.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 11.2706 | 5.0000 |
| so_extraction | realistic_nova_exports_008.json | openai:4.1 | 0 | 1 | 1.0000 | 2.8196 | 2.0000 |
| so_extraction | realistic_nova_exports_008.json | openai:5-mini | 0 | 1 | 1.0000 | 26.7961 | 6.0000 |
| so_extraction | realistic_nova_exports_008.json | openai:5.2 | 0 | 1 | 1.0000 | 2.1278 | 3.0000 |
| so_extraction | realistic_nova_exports_008.json | openai:5.4 | 0 | 1 | 1.0000 | 2.1011 | 2.0000 |
| so_extraction | realistic_nova_exports_008.json | opus-4-5 | 0 | 1 | 1.0000 | 4.9878 | 3.0000 |
| so_extraction | realistic_nova_exports_008.json | opus-4-6 | 0 | 1 | 1.0000 | 5.5318 | 3.0000 |
| so_extraction | realistic_nova_exports_008.json | sonnet-4-5 | 0 | 1 | 1.0000 | 3.3150 | 1.0000 |
| so_extraction | realistic_nova_exports_008.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.1594 | 3.0000 |
| so_extraction | realistic_nova_exports_009.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 17.1132 | 2.0000 |
| so_extraction | realistic_nova_exports_009.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 22.2329 | 7.0000 |
| so_extraction | realistic_nova_exports_009.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3906 | 0.0000 |
| so_extraction | realistic_nova_exports_009.json | openai:5-mini | 0 | 1 | 1.0000 | 13.2795 | 1.0000 |
| so_extraction | realistic_nova_exports_009.json | openai:5.2 | 0 | 1 | 1.0000 | 2.4449 | 3.0000 |
| so_extraction | realistic_nova_exports_009.json | openai:5.4 | 0 | 1 | 1.0000 | 2.1407 | 3.0000 |
| so_extraction | realistic_nova_exports_009.json | opus-4-5 | 0 | 1 | 1.0000 | 4.7577 | 1.0000 |
| so_extraction | realistic_nova_exports_009.json | opus-4-6 | 0 | 1 | 1.0000 | 4.8655 | 0.0000 |
| so_extraction | realistic_nova_exports_009.json | sonnet-4-5 | 0 | 1 | 1.0000 | 4.0259 | 2.0000 |
| so_extraction | realistic_nova_exports_009.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3658 | 0.0000 |
| so_extraction | realistic_nova_exports_010.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 16.4949 | 7.0000 |
| so_extraction | realistic_nova_exports_010.json | gemini:gemini-2.5-pro | 0 | 1 | 1.0000 | 15.5362 | 6.0000 |
| so_extraction | realistic_nova_exports_010.json | openai:4.1 | 0 | 1 | 1.0000 | 1.3439 | 2.0000 |
| so_extraction | realistic_nova_exports_010.json | openai:5-mini | 0 | 1 | 1.0000 | 24.1796 | 4.0000 |
| so_extraction | realistic_nova_exports_010.json | openai:5.2 | 0 | 1 | 1.0000 | 1.9014 | 4.0000 |
| so_extraction | realistic_nova_exports_010.json | openai:5.4 | 0 | 1 | 1.0000 | 2.0181 | 3.0000 |
| so_extraction | realistic_nova_exports_010.json | opus-4-5 | 0 | 1 | 1.0000 | 3.6940 | 1.0000 |
| so_extraction | realistic_nova_exports_010.json | opus-4-6 | 0 | 1 | 1.0000 | 4.7903 | 0.0000 |
| so_extraction | realistic_nova_exports_010.json | sonnet-4-5 | 0 | 1 | 1.0000 | 3.2096 | 1.0000 |
| so_extraction | realistic_nova_exports_010.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.5346 | 0.0000 |
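
The per-run rows above can be rolled up per model with a short script. The following is a minimal sketch, not the benchmark's own tooling: it assumes the third column is the model and that the last two columns are the runtime in seconds and the mismatch count, which is inferred from the metric definitions rather than stated in the table. The names `ROWS` and `summarize` are hypothetical, and only six rows are copied here for illustration.

```python
from collections import defaultdict
from statistics import mean

# A few rows copied verbatim from the table above; paste more rows to widen the summary.
ROWS = """
| so_extraction | generated_nova_exports_003.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.2260 | 7.0000 |
| so_extraction | generated_nova_exports_003.json | openai:4.1 | 0 | 1 | 1.0000 | 1.2030 | 1.0000 |
| so_extraction | generated_nova_exports_003.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.3251 | 0.0000 |
| so_extraction | generated_nova_exports_004.json | gemini:gemini-2.5-flash | 0 | 1 | 1.0000 | 12.9219 | 6.0000 |
| so_extraction | generated_nova_exports_004.json | openai:4.1 | 0 | 1 | 1.0000 | 1.8232 | 1.0000 |
| so_extraction | generated_nova_exports_004.json | sonnet-4-6 | 0 | 1 | 1.0000 | 4.0740 | 0.0000 |
"""


def summarize(table_text):
    """Return {model: (mean_runtime_s, mean_mismatch)} from pipe-delimited rows."""
    runtimes = defaultdict(list)
    mismatches = defaultdict(list)
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 8:
            continue  # not a per-run row (wrong column count)
        try:
            # Assumed layout: column 7 = runtime (s), column 8 = mismatch count.
            runtime = float(cells[6])
            mismatch = float(cells[7])
        except ValueError:
            continue  # header or separator row
        model = cells[2]
        runtimes[model].append(runtime)
        mismatches[model].append(mismatch)
    return {m: (mean(runtimes[m]), mean(mismatches[m])) for m in runtimes}


if __name__ == "__main__":
    # Print models from lowest to highest mean mismatch.
    for model, (rt, mm) in sorted(summarize(ROWS).items(), key=lambda kv: kv[1][1]):
        print(f"{model:28s} runtime={rt:7.3f}s  mismatch={mm:.2f}")
```

Rows with a different column count, and header or separator rows, are skipped rather than raising, so the whole table can be pasted in unchanged.
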

| Agent | Chat | Model | FS count | Mismatches | Sample |
|---|---|---|---|---|---|
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-flash | 0 | 20 | `[{"path": "data[0].items", "expected_len": 1, "actual_len": 3}, {"path": "data[0].items[0].description", "expected": "BergaPur", "actual": "Bergapur"}, {"path": "data[0].items[0].quantity", "expected": 6300.0, "actual": 10.5}, {"path": "data[0].items[0].quantity_unit", "expected": "kg", "actual": "MT"}, {"path": "data[0].items[0].unit_price", "expected": 3050.0, "actual": 3.05}]` |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-pro | 0 | 20 | `[{"path": "data", "expected_len": 1, "actual_len": 2}, {"path": "data[0].items", "expected_len": 1, "actual_len": 2}, {"path": "data[0].items[0].description", "expected": "BergaPur", "actual": "Bergapur"}, {"path": "data[0].items[0].quantity", "expected": 6300.0, "actual": 10.5}, {"path": "data[0].items[0].quantity_unit", "expected": "kg", "actual": "MT"}]` |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 0 | 19 | `[{"path": "data[0].items[0].description", "expected": "TG - BP102", "actual": "BP102"}, {"path": "data[0].items[0].quantity", "expected": 46000.0, "actual": 23.0}, {"path": "data[0].items[0].quantity_unit", "expected": "kg", "actual": "MT"}, {"path": "data[0].items[0].pricing_unit", "expected": "usd/mt", "actual": "USD/MT"}, {"path": "data[0].items[0].delivery_terms", "expected": "", "actual": "CIF Busan"}]` |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5-mini | 0 | 18 | `[{"path": "data[0].items[0].description", "expected": "BergaPur", "actual": "Bergapur"}, {"path": "data[0].items[0].quantity", "expected": 6300.0, "actual": 10.5}, {"path": "data[0].items[0].quantity_unit", "expected": "kg", "actual": "MT"}, {"path": "data[0].items[0].unit_price", "expected": 3050.0, "actual": 3.05}, {"path": "data[0].items[0].pricing_unit", "expected": "usd/mt", "actual": "USD/kg"}]` |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 0 | 18 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:4.1 | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:4.1 | 0 | 17 | [
{
"path": "data[0].items",
"expected_len": 1,
"actual_len": 3
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.4 | 0 | 17 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 3
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-5 | 0 | 17 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 3
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | gemini:gemini-2.5-flash | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:5-mini | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/KG",
"actual": "USD/kg"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:4.1 | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 10500.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3100.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": ""
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-6 | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 10500.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3100.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": ""
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.2 | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 10500.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3100.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": ""
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.4 | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 10500.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3100.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": ""
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5-mini | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 10500.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3100.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": ""
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-flash | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 10500.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3100.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": ""
}
] |
| so_extraction | 07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | gemini:gemini-2.5-pro | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 10500.0,
"actual": 10.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3100.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": ""
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-5 | 0 | 17 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.4 | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | openai:5.2 | 0 | 16 | [
{
"path": "data[0].items",
"expected_len": 1,
"actual_len": 2
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 6.3
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | sonnet-4-6 | 0 | 16 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 2
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 6.3
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-6 | 0 | 16 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 2
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 6.3
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json | opus-4-5 | 0 | 16 | [
{
"path": "data",
"expected_len": 1,
"actual_len": 2
},
{
"path": "data[0].items[0].description",
"expected": "BergaPur",
"actual": "Bergapur"
},
{
"path": "data[0].items[0].quantity",
"expected": 6300.0,
"actual": 6.3
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3050.0,
"actual": 3.05
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:5.2 | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | sonnet-4-6 | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | sonnet-4-5 | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:5.4 | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
},
{
"path": "data[0].items[0].shipment_date",
"expected": "",
"actual": "2026-02-28"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | gemini:gemini-2.5-pro | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:4.1 | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-flash | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | sonnet-4-6 | 0 | 16 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.2 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-6 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-6 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-5 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-5 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-flash | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/kg"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-pro | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5-mini | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3250.0,
"actual": 3.25
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/kg"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-flash | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-pro | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | openai:4.1 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | opus-4-5 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json | opus-4-6 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 9000.0,
"actual": 9.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Jakarta",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Jakarta"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-flash | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | gemini:gemini-2.5-pro | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.4 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5-mini | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/kg"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.4 | 0 | 15 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.4 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-5 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:4.1 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.2 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-6 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.2 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5.4 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-5 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | openai:5.2 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-5 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-6 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | opus-4-5 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-flash | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/kg"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | gemini:gemini-2.5-pro | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].unit_price",
"expected": 3500.0,
"actual": 3.5
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/KG"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:4.1 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.2 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-5 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.4 | 0 | 14 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5-mini | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "Soya Lecithin Powder"
},
{
"path": "data[0].items[0].quantity",
"expected": 39000.0,
"actual": 39.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF Nhava Sheva",
"actual": "CIF"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:4.1 | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | sonnet-4-5 | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | opus-4-6 | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json | openai:5-mini | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOIFNE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 26000.0,
"actual": 26.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "KG",
"actual": "MT"
},
{
"path": "data[0].items[0].ship_term",
"expected": "CIF NHAVA SHEVA",
"actual": "CIF"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Nhava Sheva"
}
] |
| so_extraction | 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json | sonnet-4-6 | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "GIIOFINE - P - S",
"actual": "GIIOFINE-P-S"
},
{
"path": "data[0].items[0].quantity",
"expected": 1800.0,
"actual": 1.8
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "EXW"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-pro | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5.2 | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-flash | 0 | 13 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 0 | 12 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5-mini | 0 | 12 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 46000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | gemini:gemini-2.5-pro | 0 | 11 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:4.1 | 0 | 10 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-5 | 0 | 10 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | opus-4-6 | 0 | 10 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD/MT"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF Busan"
}
] |
| so_extraction | generated_acme_foods_001.json | openai:5.4 | 0 | 10 | [
{
"path": "data[0].items[0].description",
"expected": "Arabica coffee bags",
"actual": ""
},
{
"path": "data[0].items[0].quantity",
"expected": 8.0,
"actual": null
},
{
"path": "data[0].items[0].unit_price",
"expected": 24.0,
"actual": null
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD",
"actual": ""
},
{
"path": "data[0].items[0].shipment_date",
"expected": "2026-05-15",
"actual": ""
}
] |
| so_extraction | 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json | openai:5-mini | 0 | 9 | [
{
"path": "data[0].items[0].description",
"expected": "TG - BP102",
"actual": "BP102"
},
{
"path": "data[0].items[0].quantity",
"expected": 23000.0,
"actual": 23.0
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "kg",
"actual": "MT"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "usd/mt",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "CIF"
}
] |
| so_extraction | generated_nova_exports_003.json | gemini:gemini-2.5-pro | 0 | 9 | [
{
"path": "data[0].items[0].description",
"expected": "Espresso blend",
"actual": "Espresso blend bags"
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "",
"actual": "BAGS"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD",
"actual": "USD/BAG"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "urgent dispatch within 48 hours"
},
{
"path": "data[0].items[0].shipment_date",
"expected": "",
"actual": "2026-05-15"
}
] |
| so_extraction | fs_acme_simple.json | openai:5-mini | 0 | 8 | [
{
"path": "data[0].items[0].quantity_unit",
"expected": "BAGS",
"actual": "bags"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/BAG",
"actual": "USD/bag"
},
{
"path": "data[0].items[0].ship_term",
"expected": "FOB",
"actual": ""
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "FOB Singapore",
"actual": ""
},
{
"path": "data[0].items[0].shipping_address",
"expected": "100 Finance Ave",
"actual": ""
}
] |
| so_extraction | realistic_acme_foods_007.json | gemini:gemini-2.5-flash | 0 | 8 | [
{
"path": "data[0].items[0].description",
"expected": "Green coffee beans",
"actual": "Green coffee beans sacks"
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "SACKS",
"actual": "sacks"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/SACK",
"actual": "USD"
},
{
"path": "data[0].items[0].delivery_terms",
"expected": "",
"actual": "9 Harbor Plaza"
},
{
"path": "data[0].items[0].shipment_date",
"expected": "",
"actual": "2026-05-14"
}
] |
| so_extraction | generated_acme_foods_003.json | gemini:gemini-2.5-flash | 0 | 7 | [
{
"path": "data[0].items[0].description",
"expected": "Espresso blend",
"actual": "Espresso blend bags"
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "",
"actual": "bags"
},
{
"path": "data[0].items[0].shipment_date",
"expected": "",
"actual": "2026-05-15"
},
{
"path": "data[0].do_date",
"expected": "",
"actual": "2026-05-15"
},
{
"path": "data[0].po_date",
"expected": "",
"actual": "2026-05-13"
}
] |
| so_extraction | generated_nova_exports_003.json | gemini:gemini-2.5-flash | 0 | 7 | [
{
"path": "data[0].items[0].description",
"expected": "Espresso blend",
"actual": "Espresso blend bags"
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "",
"actual": "bags"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD",
"actual": "USD/bags"
},
{
"path": "data[0].items[0].shipment_date",
"expected": "",
"actual": "2026-05-15"
},
{
"path": "data[0].do_date",
"expected": "",
"actual": "2026-05-15"
}
] |
| so_extraction | realistic_acme_foods_001.json | openai:5.4 | 0 | 7 | [
{
"path": "data[0].items[0].description",
"expected": "Assam tea",
"actual": "Assam tea cartons"
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "cartons",
"actual": "CARTONS"
},
{
"path": "data[0].items[0].unit_price",
"expected": null,
"actual": 19.0
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "",
"actual": "USD/CARTON"
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 285.0
}
] |
| so_extraction | realistic_acme_foods_001.json | gemini:gemini-2.5-flash | 0 | 7 | [
{
"path": "data[0].items[0].description",
"expected": "Assam tea",
"actual": "Assam tea cartons"
},
{
"path": "data[0].items[0].unit_price",
"expected": null,
"actual": 19.0
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "",
"actual": "USD/carton"
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 285.0
},
{
"path": "data[0].po_date",
"expected": "",
"actual": "2026-05-13"
}
] |
| so_extraction | realistic_acme_foods_007.json | openai:5.4 | 0 | 7 | [
{
"path": "data[0].items[0].description",
"expected": "Green coffee beans",
"actual": "Green coffee beans sacks"
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "SACKS",
"actual": ""
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD/SACK",
"actual": ""
},
{
"path": "data[0].items[0].shipment_date",
"expected": "",
"actual": "2026-05-14"
},
{
"path": "data[0].do_date",
"expected": "",
"actual": "2026-05-14"
}
] |
| so_extraction | realistic_acme_foods_009.json | gemini:gemini-2.5-pro | 0 | 7 | [
{
"path": "data[0].items[0].description",
"expected": "Green coffee beans",
"actual": "Green coffee beans sacks"
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "",
"actual": "sacks"
},
{
"path": "data[0].items[0].unit_price",
"expected": null,
"actual": 16.0
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "",
"actual": "USD"
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 1280.0
}
] |
| so_extraction | realistic_nova_exports_005.json | gemini:gemini-2.5-pro | 0 | 7 | [
{
"path": "data[0].items[0].quantity_unit",
"expected": "",
"actual": "PACKS"
},
{
"path": "data[0].items[0].unit_price",
"expected": null,
"actual": 14.0
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "",
"actual": "USD/PACK"
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 350.0
},
{
"path": "data[0].po_date",
"expected": "",
"actual": "2026-05-13"
}
] |
| so_extraction | realistic_nova_exports_007.json | gemini:gemini-2.5-pro | 0 | 7 | [
{
"path": "data[0].items[0].quantity_unit",
"expected": "",
"actual": "PACKS"
},
{
"path": "data[0].items[0].unit_price",
"expected": null,
"actual": 6.0
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "",
"actual": "USD/PACK"
},
{
"path": "data[0].items[0].total",
"expected": null,
"actual": 60.0
},
{
"path": "data[0].po_date",
"expected": "",
"actual": "2026-05-13"
}
] |
| so_extraction | realistic_nova_exports_009.json | gemini:gemini-2.5-pro | 0 | 7 | [
{
"path": "data[0].items",
"expected_len": 1,
"actual_len": 2
},
{
"path": "data[0].items[0].quantity",
"expected": 25.0,
"actual": 12.5
},
{
"path": "data[0].items[0].quantity_unit",
"expected": "",
"actual": "PACKS"
},
{
"path": "data[0].items[0].pricing_unit",
"expected": "USD",
"actual": "USD/PACK"
},
{
"path": "data[0].items[0].total",
"expected": 350.0,
"actual": 175.0
}
] |