Still astonished by the gap between Z-Image Turbo (which I only saw as a fast workhorse) and the full model. It's genuinely extraordinary in its aesthetic richness. Made me remember the good early days. I really hope they cook up a Z-Image 2, fully unified, bigger, stronger.
Z-Image Delivers Superior Aesthetics But Weaker Instruction Following Than Qwen-Image
same prompt, same everything, similar-sized models. One tech report has 8 pages on data engineering, the other has 1 page with crap like "uhh safety filters". Only one image can make you feel anything. This is why I say that China could do well to build a serious data market.
Z-Image (left) vs HiDream-O1 Full (right). I hope you see my point: TASTE vs NO TASTE. tbh I could rescue HiDream with negative prompting… maybe.
yeah it's small and dumb and relies on qwen encoder (I'm not even sure if it's the right one, used what was on the disk). It borks up anatomy, logic. But the palette it wields. The textures it can render. It is a thing capable of art more than GPT-Image-2 is.
billion trillion tokens of multimodal data from step 1 a real turning point for open source AI. It is not a toy—it's a serious attempt to build the foundation for long-horizon agents that can actually work across code, docs, screens, and real workflows. delving point, even.