Your contention is that models will run on devices, but latent diffusion models have much lower memory footprints — they operate in a compressed latent space rather than on raw pixels, hence the name. The hardware you need to run a good LLM is what, roughly 10x what a latent diffusion model needs? They are not comparable.
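A rough back-of-envelope sketch of the gap, assuming illustrative parameter counts (a 7B-parameter LLM vs a ~1B-parameter latent diffusion model, both stored at fp16) and counting weight memory only, ignoring activations and KV cache:

```python
def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in GB (fp16 = 2 bytes/param)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# Assumed sizes for illustration, not measurements of any specific model.
llm_gb = weight_memory_gb(7.0)        # e.g. a 7B-parameter LLM
diffusion_gb = weight_memory_gb(1.0)  # e.g. a ~1B-parameter latent diffusion model

print(f"LLM weights:       ~{llm_gb:.0f} GB")
print(f"Diffusion weights: ~{diffusion_gb:.0f} GB")
print(f"Ratio:             ~{llm_gb / diffusion_gb:.0f}x")
```

Under these assumptions the weights-only gap is about 7x, and in practice an LLM's KV cache widens it further; the exact multiple depends on the models and quantization chosen.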