Open source · every token computed in your hand

An AI that lives on your phone—not in someone’s cloud.

Quenderin downloads an open model (0.4–4.7 GB, your pick) and runs every token locally. We’ve measured ~15 tok/s on an iPhone 12 — quick enough to be useful, and honest enough to warn you when a model won’t fit. After the download, nothing you type touches a network.

View on GitHub · See how it works

~15 tok/s iPhone 12, measured
0.4–4.7 GB one download
0 network calls after setup
MIT every line public

The actual Quenderin app answering a Python question fully on-device, with syntax-highlighted code — A real screenshot — this exact conversation ran with the network off.

Local inference, done right

Local GGUF models
node-llama-cpp / llama.cpp
Apple Silicon → Raspberry Pi
MIT open source

We built this because our assistants kept vanishing with the signal.

A mainstream assistant is a thin client for a data center: lose the connection and it goes dark; keep it and your words travel to hardware you’ll never see. Quenderin is the model itself, living on the device in your hand. It’s smaller than the cloud ones and it will sometimes be wrong — we publish exactly how much smaller. It is also always there, and it is only yours.

How it works

One download. Then it’s just yours.

01
Probe
Quenderin reads your hardware—RAM, chip, GPU—and recommends the largest model your device can run well.
02
Download
Pull the model once over Wi-Fi—0.4 to 4.7 GB, your choice. It resumes in the background if you get interrupted.
03
Go offline
Everything runs locally from then on. A clear check tells you it’s safe before you leave the grid.

Quenderin's model picker: a recommended model for this device on top, every other model badged Fits, Tight, or Too big — The picker checks your RAM before offering anything — models that don’t fit are disabled, with the reason.
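For the curious, the badge logic can be sketched in a few lines. This is an illustration, not the shipping code: the badge names come from the picker, while `classifyFit`, `ModelEntry`, and the 25% headroom threshold are assumptions made up for this example.

```typescript
// Hypothetical sketch: badge names match the picker, thresholds are assumed.
type Fit = "Fits" | "Tight" | "Too big";

interface ModelEntry {
  name: string;
  minRamGb: number; // the "Min RAM" figure published for the model
}

// Assumption: "Fits" needs 25% more RAM than the stated minimum.
// Meeting the minimum without that headroom counts as "Tight".
const HEADROOM = 1.25;

function classifyFit(model: ModelEntry, deviceRamGb: number): Fit {
  if (deviceRamGb < model.minRamGb) return "Too big";
  if (deviceRamGb < model.minRamGb * HEADROOM) return "Tight";
  return "Fits";
}

const models: ModelEntry[] = [
  { name: "Llama 3.2 1B", minRamGb: 1.5 },
  { name: "Qwen3 4B", minRamGb: 4 },
  { name: "DeepSeek-R1 7B", minRamGb: 8 },
];

// Badge each model for a device with 4 GB of RAM.
for (const m of models) {
  console.log(`${m.name}: ${classifyFit(m, 4)}`);
}
```

On a 4 GB device this sketch badges Llama 3.2 1B as Fits, Qwen3 4B as Tight, and DeepSeek-R1 7B as Too big. The real picker may weigh more than RAM, such as the chip and GPU it probes.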

Straight from the app

Not mockups. These are renders of the real UI.

Every image on this page is generated from the shipping SwiftUI code — when the app changes, the site re-renders with it.

Quenderin's first-launch welcome screen: private by design, works offline, open source — First launch — three promises, then you’re in.

Every model explained in plain language — no jargon assumed.

The fitness-aware model picker — The catalog knows your hardware — Fits, Tight, or Too big, honestly.

The agent's run log: numbered tool calls (unit converter, calculator) and the final answer — The agent shows its work — every tool call in the run log, and it can never touch payments, deletion, or credentials.

Features

A real assistant, with nothing phoning home.

Truly offline

After the one-time download, there are zero network calls. Works at 35,000 feet or three days into a hike.

Airplane mode · still answering

Free, structurally

Not a promotion — there is no server to pay for. It runs on hardware you already own.

Does things, safely

A tool-using agent with a hard safety blocklist—it will never autonomously touch:

Pay · Delete · Password · Transfer
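A hard blocklist of this kind could look roughly like the sketch below. Everything here is hypothetical: `ToolCall`, `blockedReason`, `runTool`, and the idea that each tool declares its action categories are assumptions for illustration, not Quenderin's actual agent code.

```typescript
// Hypothetical sketch of a hard blocklist check; names are illustrative only.
const BLOCKED_CATEGORIES = ["pay", "delete", "password", "transfer"] as const;
type BlockedCategory = (typeof BLOCKED_CATEGORIES)[number];

interface ToolCall {
  tool: string;         // e.g. "calculator"
  categories: string[]; // assumed: each tool declares the kinds of action it performs
}

// Returns the first blocked category the call falls under, or null if none.
function blockedReason(call: ToolCall): BlockedCategory | null {
  for (const category of BLOCKED_CATEGORIES) {
    if (call.categories.indexOf(category) !== -1) return category;
  }
  return null;
}

// The check runs before the tool does, so a blocked call never executes.
function runTool(call: ToolCall): string {
  const reason = blockedReason(call);
  if (reason !== null) {
    return `refused: "${call.tool}" is in the blocked category "${reason}"`;
  }
  return `ran: ${call.tool}`;
}

console.log(runTool({ tool: "calculator", categories: ["math"] }));
console.log(runTool({ tool: "bank_app", categories: ["pay", "transfer"] }));
```

The design point is that the check sits in code, outside the model: the model can ask for a blocked tool, but the runner refuses before anything happens.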

Private by design

No accounts, no analytics, no telemetry. Your conversations are yours.

Hardware-adaptive

It picks a model that fits—and warns you before one that doesn’t.

Pi · M-series

Open source

MIT licensed. Read every line, audit every claim, build it yourself.

Resumable downloads

The model keeps downloading in the background—even if you switch apps the night before a trip.
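One common way to resume an interrupted download is an HTTP Range request: ask the server only for the bytes you don't have yet. The sketch below shows that idea under stated assumptions. `ResumeState` and `rangeHeaderFor` are hypothetical names, and whether Quenderin resumes exactly this way is not confirmed here.

```typescript
// Hypothetical sketch: build the Range header for resuming a partial download.
interface ResumeState {
  bytesOnDisk: number; // how much of the file is already saved
  totalBytes: number;  // full size of the model file
}

// Returns the Range header value to send, or null when nothing is left to fetch.
function rangeHeaderFor(state: ResumeState): string | null {
  if (state.bytesOnDisk >= state.totalBytes) return null;
  // "bytes=N-" asks for everything from offset N to the end of the file.
  return `bytes=${state.bytesOnDisk}-`;
}

console.log(rangeHeaderFor({ bytesOnDisk: 1024, totalBytes: 4096 }));
```

This only works when the server supports range requests and answers with 206 Partial Content. A client should fall back to a full download otherwise.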

Made for

When the cloud isn’t an option.

Off the grid

Planes, trains through tunnels, trails, ships, field work — anywhere a signal can’t follow you, your assistant still does.

Private by necessity

Legal, medical, journalism, research — work that simply can’t be sent to someone else’s server. Nothing leaves the device.

Free for everyone

Students, hobbyists, the curious — a capable assistant with no subscription, no token meter, no account to create.

Built for the edge of the map

You’re going off the grid for three days.

The night before, on hotel Wi-Fi, you download a model. Quenderin keeps the download alive even when you switch apps, then shows a green “ready to go offline” check. On the trail, with no bars, you still have a capable assistant in your pocket. That’s the whole point: an AI that doesn’t need the world to be online.

Privacy

Your data never had to leave. So it doesn’t.

No cloud calls after the model is downloaded
No API keys, no sign-in, no account
No analytics, no tracking, no telemetry
Conversations stored only on your device
Open source — verify all of the above yourself

Read the full Privacy Policy →

Model catalog

Bring any model. Right-sized for your hardware.

Quenderin runs on llama.cpp—so it runs any GGUF model: Llama, Qwen, DeepSeek, Mistral, Gemma, Phi. Here’s the curated shortlist it recommends from, sized for everything from a Raspberry Pi to an M-series Mac.

Model	Good for	Download	Min RAM	Quant
Qwen3 4B (Recommended)	General-purpose, Apache 2.0; the current go-to	2.4 GB	4 GB	Q4_K_M
DeepSeek-R1 7B	Step-by-step reasoning & math	4.7 GB	8 GB	Q4_K_M
Qwen2.5 Coder 7B	Code generation & tool use	4.7 GB	8 GB	Q4_K_M
Gemma 3 4B	Multilingual, 140+ languages	2.5 GB	4 GB	Q4_K_M
Phi-4 mini 3.8B	Efficient, runs well on CPU	2.3 GB	4 GB	Q4_K_M
Mistral 7B	Fast, capable all-rounder	4.1 GB	6 GB	Q4_K_M
Llama 3.2 1B	Ultra-light — runs on a Pi	0.8 GB	1.5 GB	Q4_K_M
Qwen3 14B	Best quality for a strong device	9.0 GB	12 GB	Q4_K_M

+ thousands more from Hugging Face—any GGUF works. Choosing is optional: Quenderin picks the best fit for your device automatically.
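To make "picks the best fit" concrete, here is a minimal sketch over the table above. The numbers are the catalog's. The selection rule is an assumption for illustration: take the highest Min RAM tier the device meets and break ties by catalog order. `CatalogEntry` and `recommend` are hypothetical names, and the real engine also looks at chip and GPU.

```typescript
// Hypothetical sketch: catalog figures from the table, selection rule assumed.
interface CatalogEntry {
  name: string;
  downloadGb: number;
  minRamGb: number;
}

const CATALOG: CatalogEntry[] = [
  { name: "Qwen3 4B", downloadGb: 2.4, minRamGb: 4 },
  { name: "DeepSeek-R1 7B", downloadGb: 4.7, minRamGb: 8 },
  { name: "Qwen2.5 Coder 7B", downloadGb: 4.7, minRamGb: 8 },
  { name: "Gemma 3 4B", downloadGb: 2.5, minRamGb: 4 },
  { name: "Phi-4 mini 3.8B", downloadGb: 2.3, minRamGb: 4 },
  { name: "Mistral 7B", downloadGb: 4.1, minRamGb: 6 },
  { name: "Llama 3.2 1B", downloadGb: 0.8, minRamGb: 1.5 },
  { name: "Qwen3 14B", downloadGb: 9.0, minRamGb: 12 },
];

// Assumed rule: highest Min RAM tier that fits; earlier catalog rows win ties.
function recommend(
  deviceRamGb: number,
  catalog: CatalogEntry[] = CATALOG
): CatalogEntry | null {
  let best: CatalogEntry | null = null;
  for (const entry of catalog) {
    if (entry.minRamGb > deviceRamGb) continue; // does not fit at all
    if (best === null || entry.minRamGb > best.minRamGb) best = entry;
  }
  return best; // null when even the smallest model is too big
}

const pick = recommend(4);
console.log(pick === null ? "no model fits" : pick.name);
```

Under this rule a 4 GB device gets Qwen3 4B, a 6 GB device gets Mistral 7B, and a device with less than 1.5 GB gets no recommendation rather than a model that won't run.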

FAQ

Questions, answered.

It mostly comes down to one idea: the model runs on your device, so nothing leaves it. Here’s the rest.

Still have a question?

Ask on GitHub

Is it really free?

Yes. Quenderin is MIT-licensed open source, and inference runs on your own hardware—so there are no token costs and no subscription.

Is my data sent anywhere?

No. After the one-time model download, Quenderin makes no network calls. There are no analytics, no accounts, and no telemetry. Because it’s open source, you can verify this yourself.

What devices does it run on?

The desktop app runs today on macOS and Linux. Native iOS and Android apps are in active development—the same hardware-detection and model-selection engine, rebuilt natively for performance.

Do I need a powerful phone?

No. Quenderin scales from a Raspberry Pi to an M-series Mac. It detects what your device can handle and recommends a model that actually runs—down to a 0.4 GB ultra-light build.

When can I install it?

It’s open source and runnable from GitHub today. The polished mobile apps are on the way—star or watch the repo on GitHub to hear the moment they land.

Run it today

Up and running in four lines.

It’s open source and runs from GitHub right now — on macOS or Linux.

git clone https://github.com/alikatgh/quenderin
cd quenderin
npm install
npm run electron:dev

macOS & Linux today · iOS & Android in active development

Bring your AI offline.

It’s open source and runs from GitHub today. Star the repo to follow along—and watch releases to hear the moment the mobile apps land.

Star on GitHub · Watch releases →