
ZML integrates Zai_org DFlash technique into zml_ai


ZML integrated an initial implementation of Zai_org's DFlash draft-token technique into the zml_ai framework. Steeve Morin, founder and CEO of ZML, posted terminal demonstrations running on NVIDIA GeForce RTX 5090 hardware. The runs compiled draft-token variants, loaded weights, generated streaming text, and recorded decode speeds of 206.87 to 286.87 tokens per second with a 0.231 draft acceptance rate over 297 steps.
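
For readers unfamiliar with the metric, a draft acceptance rate in speculative decoding is commonly the fraction of draft tokens that the target model accepts. The sketch below illustrates that common definition only. The post does not say how zml_ai computes its figure, so the `StepStats` type, the `acceptance_rate` helper, and the per-step counts are all hypothetical and are not taken from the demonstrated runs.

```python
from dataclasses import dataclass


@dataclass
class StepStats:
    """Counts for one decode step (hypothetical structure, not a zml_ai type)."""
    proposed: int  # draft tokens proposed in this step
    accepted: int  # draft tokens the target model accepted


def acceptance_rate(steps):
    """Return accepted / proposed over all steps, or 0.0 if nothing was proposed."""
    proposed = sum(s.proposed for s in steps)
    accepted = sum(s.accepted for s in steps)
    if proposed == 0:
        return 0.0
    return accepted / proposed


# Made-up counts for illustration only; these are not the numbers from the post.
steps = [StepStats(8, 2), StepStats(8, 1), StepStats(8, 3), StepStats(8, 0)]
rate = acceptance_rate(steps)
print(f"acceptance rate: {rate:.3f}")
```

Under this definition, a rate of 0.231 would mean that roughly 23 of every 100 proposed draft tokens were accepted, if the reported figure is measured the same way.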

Original post

Initial @Zai_org's DFlash implementation in @zml_ai (and soon in zml/llmd)

6:29 AM · May 15, 2026