The homework
The real numbers
Marketing pages usually round up. This one is the raw material: what we measured, what we interpolated, and what still runs too hot. Sources and the full research live in the repo — every claim below links to something you can check.
Decode speed, measured
Tokens per second, 4-bit, stock llama.cpp on CPU. The anchors are ~1–1.5B models from arXiv 2506.19884. Separately, our own pipeline measured ~157–177 tok/s for a 0.5B model on an M-series Mac with Metal.
| Chip | 1B (tok/s) | 3–4B (tok/s) | 7–8B (tok/s) | Status |
|---|---|---|---|---|
| A14 (iPhone 12) | 15 | ~7–9 | not advised | measured |
| A16 (iPhone 15) | 20 | ~10–13 | tight | measured |
| A17 Pro / A18 Pro | ~24–28 | ~11–13 | ~8 | interpolated |
| Snapdragon 8 Gen 2/3 | ~10–20 | ~6–12 | ~4–8 | measured anchors |
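Tokens per second translates directly into waiting time, which is easier to judge than a raw rate. A minimal sketch, assuming a 200-token reply (an illustrative length, not a measured figure) and using only the two measured 1B values from the table. `seconds_for_reply` is a hypothetical helper for this page, not part of the app, and it ignores prompt processing time.

```python
def seconds_for_reply(tokens: int, tok_per_s: float) -> float:
    """Time to decode `tokens` at a steady rate (prompt processing not included)."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return tokens / tok_per_s


# Measured 1B decode rates taken from the table above.
MEASURED_1B = {"A14 (iPhone 12)": 15.0, "A16 (iPhone 15)": 20.0}

REPLY_TOKENS = 200  # assumed reply length, for illustration only

for chip, rate in MEASURED_1B.items():
    print(f"{chip}: {seconds_for_reply(REPLY_TOKENS, rate):.1f} s")
# A14 (iPhone 12): 13.3 s
# A16 (iPhone 15): 10.0 s
```

The same division applies to the interpolated rows, but the result inherits their uncertainty, so treat it as a range rather than a single number.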
What we’ll say out loud that a landing page usually won’t
- Small models get things wrong. A 1B model is a pocket assistant, not an oracle. The app grades every model’s quality honestly — our lightest build ships labeled Low.
- Phones throttle. Sustained generation heats a phone until the chip slows itself — published studies show 10–44% drops on long runs. We design around it (thermal-aware thread planning) rather than pretend it away.
- Some numbers are interpolated. Where hard data is missing (newest chips, GPU decode), our tables say so, in italics, instead of extrapolating quietly.
- Cloud models are smarter. If you need frontier reasoning and have a connection, use one. Quenderin is for the words you don’t want to hand over and the places the connection doesn’t reach.
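The throttling point above can be made concrete with a rough sketch. The 10–44% drop range is the published figure quoted in the list. Everything else here is an assumption for illustration: the state names mirror iOS's `ProcessInfo.ThermalState` cases, while the thread fractions and the functions `plan_threads` and `throttled_range` are hypothetical and are not the app's actual planner.

```python
from enum import Enum


class ThermalState(Enum):
    NOMINAL = "nominal"
    FAIR = "fair"
    SERIOUS = "serious"
    CRITICAL = "critical"


# Hypothetical policy: fraction of performance cores to use in each state.
THREAD_FRACTION = {
    ThermalState.NOMINAL: 1.0,
    ThermalState.FAIR: 0.75,
    ThermalState.SERIOUS: 0.5,
    ThermalState.CRITICAL: 0.25,
}


def plan_threads(perf_cores: int, state: ThermalState) -> int:
    """Pick a thread count for the current thermal state, never below one."""
    return max(1, int(perf_cores * THREAD_FRACTION[state]))


def throttled_range(peak_tok_per_s: float,
                    min_drop: float = 0.10,
                    max_drop: float = 0.44) -> tuple[float, float]:
    """Sustained decode-rate range after a 10-44% thermal drop (worst, best)."""
    return (peak_tok_per_s * (1 - max_drop), peak_tok_per_s * (1 - min_drop))


low, high = throttled_range(20.0)  # 20 tok/s: measured 1B rate on the A16
print(f"sustained: {low:.1f}-{high:.1f} tok/s")
print("threads when serious:", plan_threads(4, ThermalState.SERIOUS))
```

Backing off threads before the chip throttles itself trades some peak speed for a steadier sustained rate. The right fractions depend on the device and would need to be measured rather than guessed.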
Check the homework
The research and the bug history are public — not as a stunt, but because that’s what open source means to us:
- REALITY.md — the honest “can phones actually do this” write-up that seeds the app’s calibration code.
- On-device LLM research — 28 sources, adversarially verified; the refuted claims are listed too.
- Similar projects — who else does this well, and what we learned from each of them.
- The bug journal — every bug we’ve fixed, what caused it, and the lesson. Yes, really.