Apple's new foundation models are genuinely exciting. The standout is AFM 3 Core Advanced, a 20-billion (!) parameter model that runs entirely on-device.
Read that again. 20-billion, on-device, iPhone 17 Pro.
It pulls this off by keeping the full model in flash memory and loading only a small slice of "experts" into active memory for each prompt, just 1 to 4 billion parameters at a time. That's a clever way to get around the usual DRAM wall, and it's what unlocks things like expressive voices and much sharper dictation right on the device.
The whole family of five models was built in collaboration with Google. It spans these on-device models all the way up to server-based ones running on Private Cloud Compute, with the most demanding cloud model running on NVIDIA GPUs.
Kudos, Apple!


















