Both models trade word-by-word generation for parallel denoising. Only one of them does it without losing intelligence in the ...
Google recently released DiffusionGemma, and it's weird in the best way.