Why Swift is quietly becoming a great AI runtime
Strict concurrency, value semantics, and Metal Performance Shaders make Swift surprisingly pleasant for ML glue code on iOS.

I came into this project assuming I'd be writing a pile of Objective-C++ to bridge between Swift and the model runtime. Six months later, I've written maybe 40 lines of it.
Swift 6's strict concurrency model turns out to be a gift for inference pipelines. Actors give you a clean way to serialize access to a model context without manual locks, and `AsyncSequence` makes streaming tokens trivially composable with the rest of your UI code.
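To make that concrete, here's a minimal sketch of the pattern: an actor owning the mutable model state, with token streaming exposed as an `AsyncStream`. The names (`ModelContext`, `generate`, `Token`) are illustrative, not the post's actual API — the token "generation" is a stub.

```swift
import Foundation

// Illustrative token type — value semantics, so Sendable is trivial.
struct Token: Sendable {
    let id: Int
    let text: String
}

// The actor serializes every access to mutable model state — no manual locks.
actor ModelContext {
    private var kvCacheLength = 0  // stand-in for real KV-cache state

    // Mutation is safe: all callers hop through the actor's executor.
    func nextToken(after prompt: [Int]) -> Token {
        kvCacheLength += 1
        return Token(id: kvCacheLength, text: "tok\(kvCacheLength)")
    }
}

// Token streaming as an AsyncSequence via AsyncStream, directly
// consumable with `for await` in UI code.
func generate(prompt: [Int], context: ModelContext, maxTokens: Int) -> AsyncStream<Token> {
    AsyncStream { continuation in
        Task {
            for _ in 0..<maxTokens {
                let token = await context.nextToken(after: prompt)
                continuation.yield(token)
            }
            continuation.finish()
        }
    }
}
```

On the consuming side this composes like any other sequence: `for await token in generate(prompt: ids, context: ctx, maxTokens: 16) { render(token) }`.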
Combine that with Metal Performance Shaders Graph (MPSGraph) for the parts you really do want to hand-tune, and you get a stack that's *fast enough* and *legible*. That second word is doing a lot of work — I can hand a new contributor the inference module and they can read it top to bottom.
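For flavor, here's the shape of a hand-tuned hot path in MPSGraph — a matmul fused with a softmax, roughly an attention-score computation. Shapes and names are illustrative; running it requires a Metal device via `graph.run(feeds:targetTensors:targetOperations:)`, which I've left out.

```swift
import MetalPerformanceShadersGraph

// Build a small compute graph; MPSGraph compiles it into fused Metal kernels.
let graph = MPSGraph()

// Placeholder shapes are illustrative: batch 1, 8 query rows, 64-dim heads.
let query = graph.placeholder(shape: [1, 8, 64], dataType: .float32, name: "query")
let key   = graph.placeholder(shape: [1, 64, 8], dataType: .float32, name: "key")

// scores = Q × K, then softmax over the last axis — the attention-weight step.
let scores  = graph.matrixMultiplication(primary: query, secondary: key, name: "scores")
let weights = graph.softMax(with: scores, axis: -1, name: "weights")
```

The point is that this is still plain Swift: the graph definition reads top to bottom, and the fusion happens below the API line.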
Value types pay off in surprising places. Token batches, attention masks, sampling configs — all structs. No accidental aliasing, no defensive copies, and `Sendable` conformance falls out for free. The whole inference pipeline is a series of pure functions over immutable data, with the actor holding only the KV cache.
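A minimal sketch of what that looks like — the type names are assumptions, and the greedy argmax stands in for a real sampler:

```swift
// Structs with value semantics: Sendable conformance is automatic
// because every stored property is itself Sendable.
struct SamplingConfig: Sendable {
    var temperature: Double = 1.0
    var topK: Int = 40
}

struct TokenBatch: Sendable {
    var ids: [Int]
}

// A pure function over immutable data: no aliasing, no locks, no shared state.
// Greedy argmax here is a placeholder for a real sampling strategy.
func sample(logits: [Double], config: SamplingConfig) -> Int {
    let scaled = logits.map { $0 / config.temperature }
    return scaled.indices.max { scaled[$0] < scaled[$1] } ?? 0
}
```

Because everything crossing the actor boundary is a `Sendable` value, the compiler checks the concurrency story instead of you auditing it.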
The one rough edge: tokenizers. The fastest implementations are still in Rust or C++, and the Swift ports lag behind. I ended up shipping a small Rust tokenizer via `swift-bridge`, which is a story for another post.