
Model & Prompt Performance
Compare models and prompt versions across accuracy metrics to find optimization opportunities.
Showing data from Oct 30 – Nov 29, 2025
Best model
Model with highest field accuracy in the selected period.
Claude
94.2% field accuracy (top performer)
Most cost-efficient
Model with best accuracy-to-cost ratio.
Claude
€0.0045/msg (best value)
Current prompt
Active prompt version and its field accuracy.
v2.1
94.8% accuracy, +6.6 pp since v1.0
Models evaluated
Number of models being tracked for comparison.
4 in comparison (active)
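The "+6.6 pp" figure on the current-prompt card is a percentage-point difference, not a relative change. A minimal sketch of the arithmetic, assuming the delta is simply the v2.1 accuracy minus the v1.0 accuracy (the function names are illustrative and not part of the dashboard):

```python
def baseline_from_delta(current_pct: float, delta_pp: float) -> float:
    """Recover the earlier accuracy from the current value and a pp delta."""
    return round(current_pct - delta_pp, 1)


def relative_change(current_pct: float, baseline_pct: float) -> float:
    """Relative improvement over the baseline, as a percentage of the baseline."""
    return (current_pct - baseline_pct) / baseline_pct * 100


# Values taken from the card: v2.1 at 94.8%, +6.6 pp since v1.0.
v1_accuracy = baseline_from_delta(94.8, 6.6)
print(f"Implied v1.0 accuracy: {v1_accuracy}%")
print(f"Relative improvement: {relative_change(94.8, v1_accuracy):.1f}%")
```

Under that assumption, v1.0 sat at 88.2%, and the relative improvement is about 7.5%, which is larger than the 6.6 pp headline because it is measured against the lower baseline.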
Model performance comparison
How do models compare across multiple accuracy metrics?
Models compared: Claude 3.5 Sonnet, GPT-4o Mini, Claude 3 Haiku, Gemini 1.5 Pro
Accuracy vs cost per model
Which models give the best accuracy for the cost?
Prompt version accuracy over time
Did new prompt versions improve or hurt accuracy when deployed?
Field accuracy (dashed lines = version changes)
Accuracy by field group
For specific field groups, which model is best?
Field group   Claude 3.5 Sonnet   GPT-4o Mini   Claude 3 Haiku   Gemini 1.5 Pro
Addresses     91.2%               87.5%         84.2%            89.1%
Dates         96.5%               94.2%         91.8%            95.2%
Quantities    93.8%               90.5%         87.2%            92.1%
References    89.5%               85.8%         82.5%            87.8%
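To answer "which model is best?" for each field group programmatically, the figures above can be loaded into a small lookup. This is a sketch, not the dashboard's own logic: it assumes the per-group values map to the models in legend order (Claude 3.5 Sonnet, GPT-4o Mini, Claude 3 Haiku, Gemini 1.5 Pro), and the names `ACCURACY`, `best_per_group`, and `mean_per_model` are hypothetical.

```python
# Field accuracy (%) by field group, copied from the figures above.
# Assumption: values are listed in legend order for each group.
ACCURACY = {
    "Addresses": {"Claude 3.5 Sonnet": 91.2, "GPT-4o Mini": 87.5,
                  "Claude 3 Haiku": 84.2, "Gemini 1.5 Pro": 89.1},
    "Dates": {"Claude 3.5 Sonnet": 96.5, "GPT-4o Mini": 94.2,
              "Claude 3 Haiku": 91.8, "Gemini 1.5 Pro": 95.2},
    "Quantities": {"Claude 3.5 Sonnet": 93.8, "GPT-4o Mini": 90.5,
                   "Claude 3 Haiku": 87.2, "Gemini 1.5 Pro": 92.1},
    "References": {"Claude 3.5 Sonnet": 89.5, "GPT-4o Mini": 85.8,
                   "Claude 3 Haiku": 82.5, "Gemini 1.5 Pro": 87.8},
}


def best_per_group(table: dict) -> dict:
    """Return the highest-scoring model for each field group."""
    return {group: max(scores, key=scores.get) for group, scores in table.items()}


def mean_per_model(table: dict) -> dict:
    """Unweighted mean accuracy per model across all field groups."""
    models = next(iter(table.values())).keys()
    return {m: sum(table[g][m] for g in table) / len(table) for m in models}


for group, model in best_per_group(ACCURACY).items():
    print(f"{group}: {model} ({ACCURACY[group][model]}%)")

for model, mean in mean_per_model(ACCURACY).items():
    print(f"{model}: {mean:.2f}% mean across groups")
```

On these numbers Claude 3.5 Sonnet leads every field group, with Gemini 1.5 Pro second in each. The unweighted mean treats all four groups equally, so it should not be expected to reproduce the headline field-accuracy figures, which may weight groups by field count.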