What 'private AI' actually means when the model is on the phone
Privacy isn't a checkbox in your settings screen. Here's how on-device inference changes the threat model — and the four places it can still leak.

Every AI app says it cares about privacy. Most of them mean: *we have a privacy policy*. A native AI app with on-device inference can mean something stronger: your prompts and the model's responses never touch a server.
But 'on-device' is not a magic word. There are four places a native AI app can still leak, and you have to design around all of them before you earn the privacy claim.
1. Telemetry. The single most common failure. You ship a beautiful local model, then send a 'user_sent_message' analytics event with the message length, sentiment score, and detected intent. Now you've reconstructed the conversation server-side without the words. The fix: aggregate everything client-side, ship cohort-level metrics on a schedule, never per-event.
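The cohort-level idea can be sketched in a few lines. This is a hypothetical aggregator, not a real SDK: the event names, the `min_cohort_size` threshold, and the flush cadence are all illustrative.

```python
from collections import Counter

class CohortTelemetry:
    """Buffers events on-device and emits only aggregate counts.

    Illustrative sketch: nothing about an individual message -- not
    its length, sentiment, or intent -- is ever recorded.
    """

    def __init__(self, min_cohort_size: int = 10):
        # Buckets smaller than this are suppressed so rare behaviour
        # can't single out a user.
        self.min_cohort_size = min_cohort_size
        self._counts: Counter = Counter()

    def record(self, event_name: str) -> None:
        # Count the event locally; only the name is kept.
        self._counts[event_name] += 1

    def flush(self) -> dict:
        # Called on a schedule (e.g. daily), never per-event.
        report = {name: n for name, n in self._counts.items()
                  if n >= self.min_cohort_size}
        self._counts.clear()
        return report
```

The key property is that `flush` is the only thing that ever reaches the network layer, and it carries counts, not events.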
2. Crash logs. Your stack trace will cheerfully include the last prompt in a buffer somewhere. Scrub before upload, or — better — keep crash reports fully local and let the user opt in to share specific ones.
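A minimal sketch of the scrubbing pass, assuming a dict-shaped crash payload; the sensitive key names and the "long free text is probably user content" heuristic are assumptions, and real crash SDKs will need their own walk.

```python
# Hypothetical field names; real crash payloads vary by SDK.
SENSITIVE_KEYS = {"prompt", "response", "last_message", "input_buffer"}

def scrub_crash_report(report: dict) -> dict:
    """Return a copy of a crash report with conversation data redacted."""
    clean = {}
    for key, value in report.items():
        if key in SENSITIVE_KEYS:
            # Known-sensitive fields are dropped outright.
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            # Recurse into nested structures (frames, context, etc.).
            clean[key] = scrub_crash_report(value)
        elif isinstance(value, str) and len(value.split()) > 8:
            # A long free-text string in a crash payload is usually
            # user content that leaked into a buffer; keep only its size.
            clean[key] = f"[REDACTED {len(value)} chars]"
        else:
            clean[key] = value
    return clean
```

Even with a pass like this, the safer default remains local-only reports with per-report opt-in.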
3. The escalation path. When the on-device model bails out and you hit a cloud LLM, the user needs to *know* and *consent* in that moment. A small badge isn't enough. The handoff should feel like a deliberate gear shift, not a silent fallback.
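The gear-shift can be made structural rather than cosmetic: nothing leaves the device unless a blocking consent hook returns true for *this* request. A sketch, where `local_confident` stands in for whatever bail-out signal your on-device model produces and `ask_user` is a hypothetical hook that presents the consent dialog.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EscalationDecision:
    escalate: bool
    reason: str

def maybe_escalate(local_confident: bool,
                   ask_user: Callable[[str], bool]) -> EscalationDecision:
    """Gate the cloud fallback behind in-the-moment consent."""
    if local_confident:
        return EscalationDecision(False, "handled on-device")
    # The model wants to bail out to the cloud. Nothing is sent until
    # the user says yes to this specific request -- a per-request
    # prompt, not a one-time settings toggle.
    if ask_user("Send this question to the cloud model?"):
        return EscalationDecision(True, "user consented")
    return EscalationDecision(False, "user declined; degrade gracefully")
```

Because the decision is a return value rather than a side effect, the network client can refuse to fire unless it receives an `EscalationDecision` with `escalate=True`.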
4. The keyboard. Third-party keyboards see every keystroke. You can't fix this for the user, but you can detect it and surface a gentle warning when they're using one for a sensitive conversation.
Get those four right and you have something genuinely defensible. Get any one of them wrong and the on-device story is marketing.