I’m not sure why “denoise in latent space at 64x64 and decode to pixel space at the target resolution” is fundamentally better than “denoise in pixel space at 64x64, then upscale to the target resolution in pixel space and denoise some more”.
The former is likely cheaper in compute per unit of resolution, but compute isn’t the only consideration for “better”...
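A rough back-of-envelope for the compute side: in an SD-style setup (assuming an 8x downsampling VAE with 4 latent channels, which is the Stable Diffusion convention, and a 512x512 RGB target), the denoiser touches far fewer elements per step in latent space than it would at full pixel resolution. A sketch:

```python
# Per-step tensor sizes: latent-space denoising vs full-resolution
# pixel-space denoising. Assumes an SD-style 8x VAE with 4 latent
# channels and a 512x512 RGB target -- illustrative, not measured.

latent_elems = 64 * 64 * 4    # what the denoiser sees per latent step
pixel_elems = 512 * 512 * 3   # per step when denoising at full resolution

ratio = pixel_elems / latent_elems
print(latent_elems, pixel_elems, ratio)  # pixel step handles 48x more data
```

And since self-attention cost grows quadratically with the number of spatial tokens, the gap per step is even larger than this raw element count suggests.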
Denoising in latent space certainly seems like the “correct” path. My (amateur) thinking is: the more you can do in latent space, the better.