Harness Run · so_extraction 20260518T192830Z  ·  90 runs  ·  Generated 2026-05-18 19:30 UTC

Structured extraction quality reaches 87% with consistent performance across 90 successful runs

Ten models processed prompts from the downloaded dataset, averaging under 13 seconds per run, with field match rates ranging from 80% to 92%.

OVERALL QUALITY
87% field match
The benchmark achieved an 87% field match rate across all runs, with models averaging 3 mismatches per expected run. This middle-tier performance suggests room for improvement in extraction accuracy.
TOP PERFORMERS
92% accuracy
gemini:gemini-2.5-pro led with 92% field match and lowest average mismatches (1.9), followed closely by openai:4.1 and opus-4-6 at 91%. These three models demonstrate the current quality ceiling for this extraction task.
SPEED LEADER
4.6 seconds
sonnet-4-6 delivered the fastest average runtime at 4.6 seconds while maintaining 91% field match. The slowest model (openai:5-mini at 33 seconds) also showed the weakest accuracy at 80%, so in this run a longer runtime did not buy higher accuracy.
Sec. 01

Results by dataset

This benchmark evaluated structured extraction performance on a single dataset containing prompts with expected field outputs.

downloaded
87.09%
Field match
90 runs · 3.00 mismatch
Sec. 02

Quality vs. speed

The quality-speed frontier shows clear clustering: the fastest models (sonnet and opus families) complete runs in under 7 seconds, while gemini-2.5-pro (92% field match) and openai:4.1 (91%) average 17 and 12 seconds.

Model frontier
Each bubble is one row in the table below (model × few-shot). Bubble size reflects mismatch load.
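As a rough illustration of how the frontier can be read, the sketch below keeps only the models that no other model beats on both speed and field match. It uses just the four models whose rounded averages are quoted in this report; `ModelPoint` and `pareto_frontier` are hypothetical names, not part of the harness.

```python
from typing import NamedTuple


class ModelPoint(NamedTuple):
    name: str
    avg_seconds: float  # lower is better
    field_match: float  # higher is better


def pareto_frontier(points: list[ModelPoint]) -> list[ModelPoint]:
    """Return the points that no other point dominates, fastest first."""
    frontier = []
    for p in points:
        dominated = any(
            q.avg_seconds <= p.avg_seconds
            and q.field_match >= p.field_match
            and (q.avg_seconds < p.avg_seconds or q.field_match > p.field_match)
            for q in points
        )
        if not dominated:
            frontier.append(p)
    return sorted(frontier, key=lambda p: p.avg_seconds)


# Rounded averages as quoted in this report; the other six models are omitted
# because their per-model numbers are not stated in the text.
POINTS = [
    ModelPoint("gemini:gemini-2.5-pro", 17.0, 0.92),
    ModelPoint("openai:4.1", 12.0, 0.91),
    ModelPoint("sonnet-4-6", 4.6, 0.91),
    ModelPoint("openai:5-mini", 33.0, 0.80),
]

frontier = pareto_frontier(POINTS)
for point in frontier:
    print(point.name, point.avg_seconds, point.field_match)
```

With these four points, sonnet-4-6 and gemini:gemini-2.5-pro remain on the frontier. openai:4.1 drops out only because sonnet-4-6 matches its rounded 91% in less time, so unrounded values could change the result.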
Sec. 03

Leaderboard

All ten models achieved 100% success rates, processing 9 runs each under a single few-shot configuration (FS 4, four explicit examples).

0 rows
Model FS Runs Avg s Mismatch Field match
Sec. 04

Few-shot sweep

Only one few-shot configuration (FS 4) was run, so this section reports a single data point and cannot show whether any configuration has a systematic advantage.

FS 4
87.09%

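The 2.1 mismatch standard deviation was measured with a single few-shot configuration (FS 4), so it does not by itself establish **few-shot sensitivity** either way. A sweep over example counts is needed before concluding that prompt engineering has limited impact on these models' extraction capabilities.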

Sec. 05

What to check next

Focus efforts on closing the 12-point gap between top and bottom performers (92% vs. 80%) while preserving the sub-5-second latency of the fastest model.

01
Deploy gemini-2.5-pro or openai:4.1 for production
These models offer the best field match rates (92% and 91%) at reasonable speeds (17 and 12 seconds), making them suitable for accuracy-critical extraction workflows.
02
Investigate openai:5-mini underperformance
At 80% field match and 33-second runtime, this model lags significantly. Determine if configuration issues or inherent limitations explain the 12-point accuracy gap versus leading models.
03
Optimize for sonnet-4-6 in latency-sensitive contexts
With 91% accuracy in just 4.6 seconds, this model delivers the best speed-quality balance for applications where sub-5-second response times matter.
04
Analyze the 3-mismatch baseline
With an average of 3 field mismatches per run, identify which specific fields drive errors. Targeted prompt refinement or post-processing could meaningfully improve the 87% baseline.
05
Test additional few-shot strategies
This run used a single few-shot configuration (four explicit examples). Varying the example count and exploring alternative prompting techniques (chain-of-thought, role-playing) would show whether prompting changes can improve extraction.
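One way to start on item 04 is to tally mismatches by field and separate formatting-only differences from genuine extraction errors. The sketch below is a minimal example, assuming mismatch rows shaped like the sample entries in this report (`path`, `expected`, `actual`). `field_name`, `normalize`, and `tally` are hypothetical helpers, and the normalization rules are an assumption rather than the harness's comparison logic.

```python
import re
from collections import Counter


def field_name(path: str) -> str:
    """Drop list indices so the same field groups together across items."""
    return re.sub(r"\[\d+\]", "[]", path)


def normalize(value):
    """Assumed normalizer: unify curly apostrophes, drop whitespace, fold case."""
    if not isinstance(value, str):
        return value
    return re.sub(r"\s+", "", value.replace("\u2019", "'")).casefold()


def tally(mismatches: list[dict]) -> tuple[Counter, Counter]:
    """Count mismatches per field, and how many vanish after normalization."""
    by_field: Counter = Counter()
    formatting_only: Counter = Counter()
    for row in mismatches:
        name = field_name(row["path"])
        by_field[name] += 1
        if normalize(row["expected"]) == normalize(row["actual"]):
            formatting_only[name] += 1
    return by_field, formatting_only


# Four rows copied from the sample mismatches in this report.
rows = [
    {"path": "data[0].items[0].loading",
     "expected": "23MT/40'FCL", "actual": "23 MT / 40\u2019 FCL"},
    {"path": "data[0].items[0].pricing_unit",
     "expected": "USD/KG", "actual": "USD/kg"},
    {"path": "data[0].items[0].pricing_unit",
     "expected": "USD/MT", "actual": "USD"},
    {"path": "data[0].vendor_name",
     "expected": "Van Beethoven", "actual": "AG Lipids Pte Ltd"},
]

by_field, formatting_only = tally(rows)
for name, count in by_field.most_common():
    print(name, count, "formatting-only:", formatting_only[name])
```

Fields where most mismatches are formatting-only (here `loading` and one `pricing_unit` row) are candidates for post-processing, while the rest point to prompt or model issues.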
Run configuration (JSON)
{
  "agent": "so_extraction",
  "pipeline": null,
  "models": [
    "sonnet-4-5",
    "sonnet-4-6",
    "opus-4-5",
    "opus-4-6",
    "openai:4.1",
    "openai:5.2",
    "openai:5-mini",
    "openai:5.4",
    "gemini:gemini-2.5-pro",
    "gemini:gemini-2.5-flash"
  ],
  "datasets": [
    "downloaded"
  ],
  "chat": null,
  "chats_glob": null,
  "bulk": false,
  "runs_per_chat": 1,
  "max_workers": 20,
  "few_shot_explicit": [
    "/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/multiple_product_multiple_shipment_medium.json",
    "/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/single_product_multiple_shipment_complex.json",
    "/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/single_product_single_shipment_medium.json",
    "/Users/tripathipranav/Documents/code/harness_agents/raw_data/chats/updates/update_change_payment_terms.json"
  ],
  "few_shot_walk": [],
  "few_shot_sweep": [],
  "few_shot_pool_argv": [],
  "few_shot_seed": 42,
  "db_few_shot_limit": 0,
  "skip_without_expected": true,
  "results_dir": "/Users/tripathipranav/Documents/code/harness_agents/results/20260518T192830Z",
  "config_file": "configs/agents.json",
  "few_shot_mode": "explicit",
  "few_shot_pool_size": 68,
  "few_shot_default_pool_size": 68,
  "few_shot_pool_override": null,
  "few_shot_variants": [
    {
      "label": "explicit",
      "count": 4,
      "paths": [
        "raw_data/chats/multiple_product_multiple_shipment_medium.json",
        "raw_data/chats/single_product_multiple_shipment_complex.json",
        "raw_data/chats/single_product_single_shipment_medium.json",
        "raw_data/chats/updates/update_change_payment_terms.json"
      ]
    }
  ],
  "allow_self_fewshot": false
}
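The configuration above implies the run count reported in the header. A minimal sketch, assuming the number of runs is the product of models, datasets, chats with a golden reference, few-shot variants, and `runs_per_chat`. The chat count of 9 is inferred from the 9 runs per model stated in the leaderboard section, and `expected_run_count` is a hypothetical helper, not a harness function.

```python
# Trimmed copy of the run configuration; only the keys used below are kept.
config = {
    "models": [
        "sonnet-4-5", "sonnet-4-6", "opus-4-5", "opus-4-6",
        "openai:4.1", "openai:5.2", "openai:5-mini", "openai:5.4",
        "gemini:gemini-2.5-pro", "gemini:gemini-2.5-flash",
    ],
    "datasets": ["downloaded"],
    "runs_per_chat": 1,
    "few_shot_variants": [{"label": "explicit", "count": 4}],
}

# Assumption: 9 chats have an expected output (inferred from "9 runs each").
CHATS_WITH_EXPECTED = 9


def expected_run_count(cfg: dict, chats_per_dataset: int) -> int:
    """Multiply out the run grid: models x datasets x chats x variants x repeats."""
    return (
        len(cfg["models"])
        * len(cfg["datasets"])
        * chats_per_dataset
        * len(cfg["few_shot_variants"])
        * cfg["runs_per_chat"]
    )


total_runs = expected_run_count(config, CHATS_WITH_EXPECTED)
print(total_runs)
```

Because `few_shot_variants` holds one entry and `few_shot_sweep` is empty, the grid has a single few-shot setting.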
How to read these numbers

SUCCESS RATE — Share of runs that finished without a harness or HTTP error. High success means the run was stable; it does not prove the answers matched the reference.

AVG ELAPSED (S) — Average wall time per run in that bucket. Useful for latency comparisons.

AVG MISMATCH / EXPECTED RUN — Average count of fields that differed from the golden JSON when a reference existed. Lower is better.

FIELD MATCH — Fraction of compared fields that matched the golden output across runs in that bucket. Higher is better.
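The two quality metrics can be sketched as a recursive comparison of the golden JSON against the model output. This is an illustrative sketch, not the harness's actual comparator: `diff_fields` and `field_match` are hypothetical names, only keys present in the golden JSON are compared, and a missing or mistyped container counts as a single mismatched field.

```python
from typing import Any


def diff_fields(expected: Any, actual: Any, path: str = "") -> tuple[list[dict], int]:
    """Return (mismatch rows, number of compared leaf fields)."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        mismatches, compared = [], 0
        for key, exp_value in expected.items():
            child = f"{path}.{key}" if path else key
            rows, count = diff_fields(exp_value, actual.get(key), child)
            mismatches += rows
            compared += count
        return mismatches, compared
    if isinstance(expected, list) and isinstance(actual, list):
        mismatches, compared = [], 0
        for i, exp_item in enumerate(expected):
            act_item = actual[i] if i < len(actual) else None
            rows, count = diff_fields(exp_item, act_item, f"{path}[{i}]")
            mismatches += rows
            compared += count
        return mismatches, compared
    # Leaf value, or a container compared against a different type.
    if expected == actual:
        return [], 1
    return [{"path": path, "expected": expected, "actual": actual}], 1


def field_match(mismatch_count: int, compared: int) -> float:
    """Fraction of compared fields that matched the golden output."""
    return 1.0 - mismatch_count / compared if compared else 1.0


# Trimmed illustrative record; values echo one sample row in this report,
# but the real golden JSON contains more fields.
golden = {"data": [{"vendor_name": "Van Beethoven", "items": [
    {"unit_price": 3250.0, "pricing_unit": "USD/MT", "shipping_address": "Jakarta"}]}]}
output = {"data": [{"vendor_name": "Van Beethoven", "items": [
    {"unit_price": 3.25, "pricing_unit": "USD/KG", "shipping_address": ""}]}]}

mismatches, compared = diff_fields(golden, output)
score = field_match(len(mismatches), compared)
print(len(mismatches), compared, score)
```

Averaging `len(mismatches)` over runs corresponds to the mismatch metric, and pooling mismatches and compared fields across runs corresponds to field match as defined above.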

Sample mismatches (up to 80 rows)
Agent Chat Model FS Mismatches Sample
so_extraction 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json openai:5.4 4 12
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3.05,
    "actual": null
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/KG",
    "actual": ""
  },
  {
    "path": "data[0].items[0].ship_term",
    "expected": "EXW",
    "actual": ""
  },
  {
    "path": "data[0].items[0].delivery_terms",
    "expected": "EXW",
    "actual": ""
  },
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2026-03-31",
    "actual": ""
  }
]
so_extraction 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json openai:5-mini 4 8
[
  {
    "path": "data[0].items[0].delivery_terms",
    "expected": "EXW",
    "actual": "EXW Japan"
  },
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  },
  {
    "path": "data[0].items[0].total",
    "expected": null,
    "actual": 32025.0
  },
  {
    "path": "data[0].do_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  },
  {
    "path": "data[0].po_ref_no",
    "expected": "4520000944",
    "actual": ""
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json opus-4-5 4 8
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2025-11-15",
    "actual": "2026-11-15"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40' FCL"
  },
  {
    "path": "data[0].do_date",
    "expected": "2025-11-15",
    "actual": "2026-11-15"
  },
  {
    "path": "data[0].po_date",
    "expected": "2025-09-29",
    "actual": ""
  }
]
so_extraction 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json opus-4-5 4 8
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2025-11-15",
    "actual": "2026-11-15"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40' FCL"
  },
  {
    "path": "data[0].do_date",
    "expected": "2025-11-15",
    "actual": "2026-11-15"
  },
  {
    "path": "data[0].po_date",
    "expected": "2025-09-29",
    "actual": ""
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json openai:5-mini 4 7
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD"
  },
  {
    "path": "data[0].items[0].delivery_terms",
    "expected": "CIF Busan",
    "actual": ""
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  }
]
so_extraction 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json openai:5-mini 4 7
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD"
  },
  {
    "path": "data[0].items[0].delivery_terms",
    "expected": "CIF Busan",
    "actual": "CIF"
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  }
]
so_extraction 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json openai:5-mini 4 6
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/KG",
    "actual": "USD/kg"
  },
  {
    "path": "data[0].items[0].delivery_terms",
    "expected": "EXW",
    "actual": "EXW cargo ready by March 2026"
  },
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": ""
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json openai:5.4 4 6
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Jakarta",
    "actual": ""
  },
  {
    "path": "data[0].items[0].total",
    "expected": 29250.0,
    "actual": null
  },
  {
    "path": "data[0].po_ref_no",
    "expected": "PO-IMP-BIB-2601-017",
    "actual": ""
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json openai:5.2 4 6
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD"
  },
  {
    "path": "data[0].items[0].packing",
    "expected": "25kg printed paper bag",
    "actual": "25kg printed paper bags"
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  }
]
so_extraction 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json openai:5.2 4 5
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].items[0].total",
    "expected": 5850.0,
    "actual": 5.8500000000000005
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  },
  {
    "path": "data[0].shipping_method",
    "expected": "Collection Against OPO 260012/EC",
    "actual": ""
  }
]
so_extraction 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json gemini:gemini-2.5-flash 4 5
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/KG",
    "actual": "USD/kg"
  },
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Japan",
    "actual": ""
  },
  {
    "path": "data[0].items[0].total",
    "expected": null,
    "actual": 32025.0
  },
  {
    "path": "data[0].do_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  }
]
so_extraction 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json openai:5.2 4 5
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  },
  {
    "path": "data[0].items[0].total",
    "expected": null,
    "actual": 32.025
  },
  {
    "path": "data[0].do_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  },
  {
    "path": "data[0].shipping_method",
    "expected": "Collection",
    "actual": "Collection against PO 4520000944"
  }
]
so_extraction 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json openai:5-mini 4 5
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Jakarta",
    "actual": ""
  },
  {
    "path": "data[0].po_ref_no",
    "expected": "PO-IMP-BIB-2601-017",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json sonnet-4-6 4 5
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  },
  {
    "path": "data[0].po_date",
    "expected": "2025-09-29",
    "actual": ""
  },
  {
    "path": "data[0].po_ref_no",
    "expected": "BP102-2025-1",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  },
  {
    "path": "data[0].shipping_address",
    "expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json sonnet-4-5 4 5
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2025-11-15",
    "actual": "2026-11-15"
  },
  {
    "path": "data[0].do_date",
    "expected": "2025-11-15",
    "actual": "2026-11-15"
  },
  {
    "path": "data[0].po_date",
    "expected": "2025-09-29",
    "actual": ""
  },
  {
    "path": "data[0].po_ref_no",
    "expected": "BP102-2025-1",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": "Leonardo da Vinci, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea"
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json openai:5.4 4 5
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD"
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].billing_address",
    "expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  },
  {
    "path": "data[0].shipping_address",
    "expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json gemini:gemini-2.5-flash 4 5
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD"
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40\tFCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  },
  {
    "path": "data[0].delivery_terms",
    "expected": "CIF Busan",
    "actual": "CIF"
  }
]
so_extraction 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json sonnet-4-5 4 5
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2025-11-15",
    "actual": "2026-11-15"
  },
  {
    "path": "data[0].do_date",
    "expected": "2025-11-15",
    "actual": "2026-11-15"
  },
  {
    "path": "data[0].po_date",
    "expected": "2025-09-29",
    "actual": ""
  },
  {
    "path": "data[0].po_ref_no",
    "expected": "BP102-2025-1",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": "Leonardo da Vinci, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea"
  }
]
so_extraction 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json openai:5.2 4 5
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD"
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  },
  {
    "path": "data[0].shipping_method",
    "expected": "",
    "actual": "Unknown"
  }
]
so_extraction 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json sonnet-4-6 4 4
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3.25,
    "actual": 3250.0
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/KG",
    "actual": "USD/MT"
  },
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": "2026-03-01"
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": "2026-03-01"
  }
]
so_extraction 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json openai:5.4 4 4
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].items[0].total",
    "expected": 5850.0,
    "actual": null
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json opus-4-5 4 4
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Japan",
    "actual": ""
  },
  {
    "path": "data[0].items[0].total",
    "expected": null,
    "actual": 32025.0
  },
  {
    "path": "data[0].do_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  }
]
so_extraction 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json opus-4-6 4 4
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Japan",
    "actual": ""
  },
  {
    "path": "data[0].items[0].total",
    "expected": null,
    "actual": 32025.0
  },
  {
    "path": "data[0].do_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  }
]
so_extraction 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json openai:5.2 4 4
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Jakarta",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction 06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json openai:5-mini 4 4
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/KG",
    "actual": "USD/kg"
  },
  {
    "path": "data[0].items[0].delivery_terms",
    "expected": "EXW",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  },
  {
    "path": "data[0].delivery_terms",
    "expected": "EXW",
    "actual": ""
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json gemini:gemini-2.5-pro 4 4
[
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  },
  {
    "path": "data[0].shipping_method",
    "expected": "",
    "actual": "Unknown"
  }
]
so_extraction 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json openai:4.1 4 4
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD"
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  }
]
so_extraction 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json openai:5.4 4 4
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD"
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  }
]
so_extraction 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json gemini:gemini-2.5-flash 4 4
[
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40\nThe Shipment Terms can have only these values \"EXW\", \"FOB\", \"CIF\", \"DDP\" (find approriate value for the Shipment Terms from chat messages). If not found it should be "
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  },
  {
    "path": "data[0].delivery_terms",
    "expected": "CIF Busan",
    "actual": "CIF"
  }
]
so_extraction 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json sonnet-4-5 4 3
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].billing_address",
    "expected": "",
    "actual": "Leonardo da Vinci"
  }
]
so_extraction 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json opus-4-5 4 3
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].po_ref_no",
    "expected": "",
    "actual": "OPO 260012/EC"
  }
]
so_extraction 01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json gemini:gemini-2.5-pro 4 3
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].po_ref_no",
    "expected": "",
    "actual": "OPO 260012/EC"
  }
]
so_extraction 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json gemini:gemini-2.5-pro 4 3
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
  },
  {
    "path": "data[0].shipping_address",
    "expected": "",
    "actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
  }
]
so_extraction 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json openai:5.4 4 3
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": ""
  }
]
so_extraction 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json openai:5.2 4 3
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": ""
  }
]
so_extraction 02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json openai:5-mini 4 3
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist \t6 Satara, Maharashtra \t6 412803"
  }
]
so_extraction 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json sonnet-4-5 4 3
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Japan",
    "actual": ""
  },
  {
    "path": "data[0].items[0].total",
    "expected": null,
    "actual": 32025.0
  },
  {
    "path": "data[0].billing_address",
    "expected": "",
    "actual": "Leonardo da Vinci"
  }
]
so_extraction 03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json openai:4.1 4 3
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Japan",
    "actual": ""
  },
  {
    "path": "data[0].items[0].total",
    "expected": null,
    "actual": 32025.0
  },
  {
    "path": "data[0].shipping_method",
    "expected": "Collection",
    "actual": ""
  }
]
so_extraction 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json sonnet-4-5 4 3
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Jakarta",
    "actual": ""
  }
]
so_extraction 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json opus-4-6 4 3
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Jakarta",
    "actual": ""
  }
]
so_extraction 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json opus-4-5 4 3
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Jakarta",
    "actual": ""
  }
]
so_extraction 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json gemini:gemini-2.5-flash 4 3
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  },
  {
    "path": "data[0].delivery_terms",
    "expected": "CIF Jakarta",
    "actual": "CIF"
  }
]
so_extraction 04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json openai:4.1 4 3
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Jakarta",
    "actual": ""
  }
]
so_extraction 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json sonnet-4-5 4 3
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "",
    "actual": "13MT/20'FCL"
  },
  {
    "path": "data[0].billing_address",
    "expected": "",
    "actual": "Leonardo da Vinci, "
  }
]
so_extraction 05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json openai:5-mini 4 3
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": ""
  },
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json openai:4.1 4 3
[
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  }
]
so_extraction 08__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json opus-4-6 4 3
[
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40' FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  }
]
so_extraction 09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json sonnet-4-6 4 3
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "FeedBEST Company Limited, Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  },
  {
    "path": "data[0].shipping_address",
    "expected": "Factory 354-58 Mojeon-1 Sobuk-gu Republic of Korea",
    "actual": ""
  }
]
so_extraction09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json · opus-4-6 · FS 4 · 3 mismatches
[
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40' FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  }
]
so_extraction09__2025-09-29__120363403592950429_g_us__d586d853-694c-42f9-93be-bc7ba5b2110c.json · gemini:gemini-2.5-pro · FS 4 · 3 mismatches
[
  {
    "path": "data[0].items[0].loading",
    "expected": "23MT/40'FCL",
    "actual": "23 MT / 40’ FCL"
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": "AG Lipids Pte Ltd"
  },
  {
    "path": "data[0].payment_date",
    "expected": "",
    "actual": "Net 14 Days"
  }
]
so_extraction01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json · opus-4-6 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  }
]
so_extraction01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json · gemini:gemini-2.5-flash · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": ""
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": ""
  }
]
so_extraction01__2026-02-24__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json · openai:4.1 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  },
  {
    "path": "data[0].do_date",
    "expected": "2027-03-01",
    "actual": "2026-03-31"
  }
]
so_extraction02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · sonnet-4-6 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": ""
  }
]
so_extraction02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · sonnet-4-5 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": "Leonardo da Vinci"
  }
]
so_extraction02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · opus-4-5 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
  }
]
so_extraction02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · opus-4-6 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
  }
]
so_extraction02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · gemini:gemini-2.5-flash · FS 4 · 2 mismatches
[
  {
    "path": "data[0].delivery_terms",
    "expected": "CIF Nhava Sheva",
    "actual": "CIF"
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": ""
  }
]
so_extraction02__2026-02-09__120363426578757754_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · openai:4.1 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].billing_address",
    "expected": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist Satara, Maharashtra - 412803",
    "actual": "GIIAVA (India) Pvt. Ltd., Plot No. C3, Wai MIDC, Dist – Satara, Maharashtra – 412803"
  }
]
so_extraction03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json · sonnet-4-6 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipment_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  },
  {
    "path": "data[0].do_date",
    "expected": "2026-03-31",
    "actual": "2026-03-15"
  }
]
so_extraction04__2026-01-29__120363408498669191_g_us__4b9c2faa-94dd-4236-abcc-398807051f21.json · gemini:gemini-2.5-pro · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3250.0,
    "actual": 3.25
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/MT",
    "actual": "USD/KG"
  }
]
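The two mismatches above move together: the price is 1000 times smaller and the unit changes from `USD/MT` to `USD/KG`. The sketch below checks whether such a pair describes the same price once both sides use a common unit. It assumes MT means metric ton (1000 kg); the `KG_PER_UNIT` table and both helper names are hypothetical and not taken from the harness.

```python
import math

# Assumption: "MT" is a metric ton, i.e. 1000 kg. Not stated in the report.
KG_PER_UNIT = {"USD/KG": 1.0, "USD/MT": 1000.0}

def price_per_kg(unit_price: float, pricing_unit: str) -> float:
    """Convert a (price, unit) pair to USD per kilogram."""
    return unit_price / KG_PER_UNIT[pricing_unit.upper()]

def same_price(a: tuple, b: tuple) -> bool:
    """True when two (unit_price, pricing_unit) pairs denote the same price."""
    return math.isclose(price_per_kg(*a), price_per_kg(*b), rel_tol=1e-9)

expected = (3250.0, "USD/MT")
actual = (3.25, "USD/KG")  # values reported for gemini:gemini-2.5-pro
equivalent = same_price(expected, actual)
print(equivalent)
```

A field-by-field comparison counts this as two mismatches even though, under the stated assumption, both sides encode the same price.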
so_extraction05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · opus-4-5 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].items[0].loading",
    "expected": "",
    "actual": "13MT/20'FCL"
  }
]
so_extraction05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · openai:5.4 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · openai:5.2 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · gemini:gemini-2.5-flash · FS 4 · 2 mismatches
[
  {
    "path": "data[0].delivery_terms",
    "expected": "CIF Nhava Sheva",
    "actual": "CIF"
  },
  {
    "path": "data[0].billing_address",
    "expected": "",
    "actual": "Leonardo da Vinci"
  }
]
so_extraction06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json · openai:5.4 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].total",
    "expected": 6300.0,
    "actual": null
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json · sonnet-4-6 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].unit_price",
    "expected": 3.5,
    "actual": 3500.0
  },
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/KG",
    "actual": "USD/MT"
  }
]
so_extraction06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json · openai:5.2 · FS 4 · 2 mismatches
[
  {
    "path": "data[0].items[0].total",
    "expected": 6300.0,
    "actual": 6.3
  },
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction03__2026-01-30__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json · gemini:gemini-2.5-pro · FS 4 · 1 mismatch
[
  {
    "path": "data[0].items[0].total",
    "expected": null,
    "actual": 32025.0
  }
]
so_extraction05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · sonnet-4-6 · FS 4 · 1 mismatch
[
  {
    "path": "data[0].items[0].loading",
    "expected": "",
    "actual": "13MT/20'FCL"
  }
]
so_extraction05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · openai:4.1 · FS 4 · 1 mismatch
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  }
]
so_extraction05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · opus-4-6 · FS 4 · 1 mismatch
[
  {
    "path": "data[0].items[0].shipping_address",
    "expected": "Nhava Sheva",
    "actual": ""
  }
]
so_extraction05__2026-01-20__120363407382355715_g_us__12a4f3a7-d506-4d32-ae06-3f76508c6abd.json · gemini:gemini-2.5-pro · FS 4 · 1 mismatch
[
  {
    "path": "data[0].items[0].loading",
    "expected": "",
    "actual": "13MT/20'FCL"
  }
]
so_extraction06__2026-01-06__120363421131250401_g_us__e05574ec-b110-4554-9fc3-3abb4f9011a8.json · gemini:gemini-2.5-flash · FS 4 · 1 mismatch
[
  {
    "path": "data[0].items[0].pricing_unit",
    "expected": "USD/KG",
    "actual": "USD/kg"
  }
]
so_extraction07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json · openai:4.1 · FS 4 · 1 mismatch
[
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json · sonnet-4-5 · FS 4 · 1 mismatch
[
  {
    "path": "data",
    "expected_len": 1,
    "actual_len": 0
  }
]
so_extraction07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json · opus-4-5 · FS 4 · 1 mismatch
[
  {
    "path": "data",
    "expected_len": 1,
    "actual_len": 0
  }
]
so_extraction07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json · openai:5.4 · FS 4 · 1 mismatch
[
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
so_extraction07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json · opus-4-6 · FS 4 · 1 mismatch
[
  {
    "path": "data",
    "expected_len": 1,
    "actual_len": 0
  }
]
so_extraction07__2025-12-23__120363403074656566_g_us__8f477a8f-2a60-4e0a-bf0e-8cc3cdf1dc9f.json · openai:5.2 · FS 4 · 1 mismatch
[
  {
    "path": "data[0].vendor_name",
    "expected": "Van Beethoven",
    "actual": ""
  }
]
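The mismatch records in this section take two shapes: a scalar difference (`path`, `expected`, `actual`) and a list-length difference (`path`, `expected_len`, `actual_len`). The harness's comparison code is not shown in this report, so the sketch below is only one assumed way such records could be produced; `diff_fields` is a hypothetical name, and treating a key missing from the output as `None` is an assumption.

```python
from typing import Any

def diff_fields(expected: Any, actual: Any, path: str = "") -> list[dict]:
    """Walk the expected structure and collect one record per differing field."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        records = []
        for key in expected:
            child = f"{path}.{key}" if path else key
            # Assumption: a key missing from the model output compares as None.
            records += diff_fields(expected[key], actual.get(key), child)
        return records
    if isinstance(expected, list) and isinstance(actual, list):
        if len(expected) != len(actual):
            # Length differences are reported once, without descending further.
            return [{"path": path,
                     "expected_len": len(expected),
                     "actual_len": len(actual)}]
        records = []
        for i, (exp_item, act_item) in enumerate(zip(expected, actual)):
            records += diff_fields(exp_item, act_item, f"{path}[{i}]")
        return records
    if expected != actual:
        return [{"path": path, "expected": expected, "actual": actual}]
    return []

gold = {"data": [{"vendor_name": "Van Beethoven"}]}
scalar_diff = diff_fields(gold, {"data": [{"vendor_name": ""}]})
length_diff = diff_fields(gold, {"data": []})
print(scalar_diff)
print(length_diff)
```

Under this scheme an empty `data` list yields a single record, which is consistent with the one-mismatch rows above where `actual_len` is 0, although the harness may count such cases differently.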