How accurate is Kairo? We benchmarked it — and we're showing our work.
Kairo estimates calories and macros from a photo of your meal. The honest question is: how close is that estimate? So we tested it. On 2026-06-16 we ran 16 internationally diverse meals through Kairo's full production analysis path and scored every result against public nutrition databases.
Result: 16 of 16 cases passed our accuracy gate, with ~96% weighted-macro accuracy across the set. On branded items with a real nutrition label, the mean per-100g calorie error was about 3.5% (median ~2.5%). On generic whole foods, error was higher — 10–12%.
This is a small internal benchmark, not a head-to-head against other apps. Here's exactly how we measured it, every number, and the sources — so you can check us.
Methodology
We run a curated set of 16 hard, realistic meals through the same production code path Kairo uses for a real photo scan. The set covers international brands, tricky portions, and packaged goods across German (de-DE), US (en-US), Japanese (ja-JP), and Italian (it-IT) cuisines. We don't grade a special test mode; we grade the app. Each result then goes through three checks:
- Deterministic weighted-macro scorer — a fixed formula compares estimated calories and macros to ground truth. No human, no LLM judgment in this score.
- LLM user-experience judge — a separate check on whether the result reads as reasonable and useful.
- Citation-presence guard — programmatically verifies that any cited source URL exists and supports the number. A fabricated source cannot pass. This closes the most common failure mode in AI accuracy testing: trusting cited sources at face value.
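To make "a fixed formula" concrete, here is a minimal sketch of what a deterministic weighted-macro scorer could look like. This page does not publish the actual formula, so the weights, the field names, and the `1 - relative error` accuracy definition are all assumptions for illustration.

```python
# Hypothetical weights: the real formula is not published on this page.
WEIGHTS = {"kcal": 0.4, "protein_g": 0.2, "carbs_g": 0.2, "fat_g": 0.2}


def field_accuracy(estimate: float, truth: float) -> float:
    """Return 1 - relative error, clamped to the range [0, 1]."""
    if truth == 0:
        # Relative error is undefined; count only an exact zero as correct.
        return 1.0 if estimate == 0 else 0.0
    return max(0.0, 1.0 - abs(estimate - truth) / truth)


def weighted_macro_accuracy(estimate: dict, truth: dict) -> float:
    """Weighted average of per-field accuracies (the weights sum to 1)."""
    return sum(w * field_accuracy(estimate[k], truth[k]) for k, w in WEIGHTS.items())


# Illustrative numbers only, not taken from the benchmark.
truth = {"kcal": 200.0, "protein_g": 10.0, "carbs_g": 20.0, "fat_g": 10.0}
estimate = {"kcal": 190.0, "protein_g": 10.0, "carbs_g": 22.0, "fat_g": 9.0}
score = weighted_macro_accuracy(estimate, truth)
print(f"{score:.3f}")  # 0.940
```

Because the function has no randomness and no model call, the same estimate and ground truth always produce the same score, which is the property the scorer description relies on.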
Pass gate: for an item with a hard nutrition label, the per-100g calorie estimate must be within ±20% of the label. For a composite plate with no single ground truth, it must fall within a realistic calorie band. The whole suite (evals/) is reproducible.
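The two gate rules can be sketched in a few lines. The function names are hypothetical, and treating the ±20% boundary as inclusive is an assumption; only the 20% tolerance and the idea of a calorie band come from the text above.

```python
def per_100g_error(estimate_kcal: float, label_kcal: float) -> float:
    """Relative per-100g calorie error against the label value."""
    return abs(estimate_kcal - label_kcal) / label_kcal


def passes_label_gate(estimate_kcal: float, label_kcal: float,
                      tolerance: float = 0.20) -> bool:
    """Labelled item: the estimate must be within +/-20% of the label."""
    return per_100g_error(estimate_kcal, label_kcal) <= tolerance


def passes_band_gate(estimate_kcal: float, low_kcal: float, high_kcal: float) -> bool:
    """Composite plate: the estimate must fall inside a realistic calorie band."""
    return low_kcal <= estimate_kcal <= high_kcal


# Illustrative values only, not benchmark data.
print(passes_label_gate(520.0, 500.0))        # 4% off the label
print(passes_label_gate(610.0, 500.0))        # 22% off the label
print(passes_band_gate(750.0, 600.0, 900.0))  # inside a hypothetical band
```

The first and third calls pass and the second fails, since 22% exceeds the 20% tolerance.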
Ground-truth sources
Every number below is measured against a public, checkable source:
- USDA FoodData Central — generic whole foods
- Open Food Facts — branded/packaged products with hard labels
- Published restaurant nutrition — chain menu data
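As a rough illustration of how a citation-presence guard can be made programmatic, the sketch below accepts a citation only if its URL is on an allowlisted host and the source text actually contains the claimed number. Everything here is an assumption: the function name, the allowlist, the placeholder product URL, and the naive substring match are illustrative, and fetching the page is left out so the example runs offline.

```python
from urllib.parse import urlparse

# Assumed allowlist: public hosts of two of the databases named above.
# A real guard would also need an entry for each restaurant data source.
ALLOWED_HOSTS = {"fdc.nal.usda.gov", "world.openfoodfacts.org"}


def citation_supports(url: str, claimed_kcal: float, page_text: str) -> bool:
    """True only if the host is allowlisted and the text contains the number."""
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_HOSTS:
        return False
    # Naive check: the claimed value must appear verbatim in the source text.
    return f"{claimed_kcal:g}" in page_text


# Placeholder URL and text, not a real product page.
url = "https://world.openfoodfacts.org/product/0000000000000"
text = "Energy: 539 kcal per 100 g"
print(citation_supports(url, 539.0, text))
print(citation_supports("https://example.com/made-up", 539.0, text))
print(citation_supports(url, 480.0, text))
```

Only the first call returns `True`. An unknown host or a number missing from the source text fails the check, which is how a fabricated source is rejected rather than trusted.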
Results — full production path, 2026-06-16
16 / 16 passed · 0 errors · ~96% weighted-macro accuracy
Branded & packaged items — per-100g calorie error vs Open Food Facts
| Item | Per-100g error |
|---|---|
| Kinder Joy | 0.0% |
| Nutella | 0.2% |
| Skyr | 1.5% |
| Kölln Haferflocken (oats) | 3.4% |
| Pocari Sweat | 4.4% |
| Chobani | 11.7% |
| Mean | ~3.5% |
| Median | ~2.5% |
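The mean and median rows can be checked directly from the six per-item errors in the table, here using Python's standard `statistics` module:

```python
from statistics import mean, median

# Per-100g calorie errors (%) copied from the table above.
errors = [0.0, 0.2, 1.5, 3.4, 4.4, 11.7]

mean_error = mean(errors)
median_error = median(errors)
print(round(mean_error, 2))    # 3.53
print(round(median_error, 2))  # 2.45
```

The mean sits above the median because the single largest error (Chobani, 11.7%) pulls the average up.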
Restaurant item — vs published US nutrition
| Item | Per-100g error |
|---|---|
| McDonald's cheeseburger | 4.2% |
Generic whole foods — per-100g calorie error vs USDA FoodData Central
| Item | Per-100g error |
|---|---|
| Apple | 9.9% |
| Chicken breast | 10.1% |
| Banana | 12.4% |
Reading the results: Kairo is most precise on packaged items with a real label, where retrieval can pin down an exact number, and less precise on generic whole foods, where portion size and natural variation are genuinely ambiguous (a single "banana" varies by more than 30% in size and calories in the real world). For context, peer-reviewed research puts end-to-end AI photo calorie error around 15–25%. Kairo's whole-food results (about 10–12%) come in under that range, and its branded results sit well below it.
What this does and doesn't prove
What it shows: a transparent, reproducible accuracy measurement on a defined, internationally diverse 16-case set, scored deterministically against named public databases (USDA FoodData Central, Open Food Facts, published restaurant nutrition), with a programmatic guard against fabricated citations.
What it does NOT show: that Kairo is "the most accurate app." This is a small (16-case) internal benchmark. It is not a head-to-head against Cal AI, MyFitnessPal, Cronometer, or any other app, and it is not a clinical study. We did not test other apps on this set and make no comparative claim. A larger or differently composed set would likely shift the averages. Whole-food estimates carry real uncertainty from portion and natural variation.
Who built this
Kairo is built by Valentin Weinert, a software engineer (not a dietitian), who also built its reproducible evaluation pipeline. The reason this page exists: an accuracy claim you can't check isn't worth much. If you run an independent benchmark, we'll share the methodology and ground-truth sources so you can verify the numbers yourself.
Company: Centaurio UG (haftungsbeschränkt) i.G., Germany. Hosting: EU/GDPR.
Scan your meal. See the numbers.
Get Kairo on the App Store