Comprehensive analysis of 26 AI models for receipt processing
| Rank | Model | Success | Quality | Actual Cost (per receipt) | Speed | Tokens (prompt / completion) |
|---|---|---|---|---|---|---|
| 1 | Sherlock Think Alpha | 100% | 100% | FREE | 14.01s | 603 / 786 (⚡755 reasoning) |
| 2 | OpenAI GPT-5.1 | 100% | 100% | $0.00406 | 11.03s | 1,073 / 272 (⚡219 reasoning) |
| 3 | OpenAI GPT-5.1 Chat | 100% | 100% | $0.00204 | 4.94s | 1,073 / 70 (⚡32 reasoning) |
| 4 | OpenAI GPT-5.1-Codex | 100% | 100% | $0.00271 | 8.62s | 1,073 / 137 (⚡85 reasoning) |
| 5 | OpenAI GPT-5.1-Codex-Mini | 100% | 100% | $0.00066 | 4.88s | 1,803 / 106 (⚡64 reasoning) |
| 6 | Amazon Nova Premier 1.0 | 100% | 100% | $0.00665 | 6.99s | 2,335 / 65 |
| 7 | Perplexity Sonar Pro Search | 100% | 100% | $0.01925 | 6.16s | 183 / 47 |
| 8 | NVIDIA Nemotron (free) | 100% | 100% | FREE | 9.15s | 2,378 / 446 (⚡381 reasoning) |
| 9 | NVIDIA Nemotron Nano 12B 2 VL | 100% | 100% | $0.00037 | 6.00s | 2,378 / 446 (⚡381 reasoning) |
| 10 | Anthropic Claude Haiku 4.5 | 100% | 100% | $0.00198 | 2.89s ⚡ | 1,658 / 63 |
| 11 | Qwen3 VL 8B Thinking | 100% | 100% | $0.00130 | 11.30s | 1,634 / 480 (⚡445 reasoning) |
| 12 | Qwen3 VL 8B Instruct | 100% | 100% | $0.00042 💎 | 5.83s | 2,698 / 52 |
| 13 | Anthropic Claude 3.5 Sonnet | 100% | 100% | $0.00585 | 4.62s | 1,658 / 58 |
| 14 | Sherlock Dash Alpha | 100% | 97.9% | FREE 🥇 | 3.04s | 615 / 33 |
| 15 | Google Gemini 3 Pro Preview | 100% | 97.9% | $0.02156 | 22.35s | 1,279 / 1,584 (⚡1,546 reasoning) |
| 16-26 | 11 models returned 404 errors (not available on OpenRouter) | | | | | |
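
The per-receipt costs in the table are small enough that they are easier to compare once scaled to a batch. The following is a minimal sketch of that arithmetic, not part of the original benchmark. It copies a handful of rows from the table by hand, and the names `ROWS`, `cost_per_thousand`, `cheapest_perfect`, and `fastest` are hypothetical helpers introduced only for illustration. Models listed as FREE are entered as a cost of 0.0, which is an assumption of this sketch.

```python
# Illustrative only: a few rows copied by hand from the results table.
# Each tuple is (model, quality_percent, cost_per_receipt_usd, seconds).
# FREE models are entered as 0.0 (an assumption made for this sketch).
ROWS = [
    ("Sherlock Think Alpha", 100.0, 0.0, 14.01),
    ("OpenAI GPT-5.1", 100.0, 0.00406, 11.03),
    ("OpenAI GPT-5.1-Codex-Mini", 100.0, 0.00066, 4.88),
    ("NVIDIA Nemotron Nano 12B 2 VL", 100.0, 0.00037, 6.00),
    ("Anthropic Claude Haiku 4.5", 100.0, 0.00198, 2.89),
    ("Qwen3 VL 8B Instruct", 100.0, 0.00042, 5.83),
    ("Google Gemini 3 Pro Preview", 97.9, 0.02156, 22.35),
]


def cost_per_thousand(cost_per_receipt: float) -> float:
    """Scale a per-receipt cost in USD to a cost per 1,000 receipts."""
    return round(cost_per_receipt * 1000, 2)


def cheapest_perfect(rows, paid_only=True):
    """Return the lowest-cost row among models with 100% quality.

    With paid_only=True, rows whose cost is 0.0 (FREE) are skipped.
    """
    candidates = [
        r for r in rows
        if r[1] == 100.0 and (r[2] > 0 or not paid_only)
    ]
    return min(candidates, key=lambda r: r[2])


def fastest(rows):
    """Return the row with the lowest time in seconds."""
    return min(rows, key=lambda r: r[3])


if __name__ == "__main__":
    # Print the sampled models from cheapest to most expensive.
    for model, quality, cost, seconds in sorted(ROWS, key=lambda r: r[2]):
        print(f"{model:32s} ${cost_per_thousand(cost):6.2f} per 1,000 receipts, {seconds:.2f}s")
```

Only the sampled rows are considered, so any ranking this produces applies to that subset and not to the full table.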
Tested on diverse receipt formats: