It's copy-pasting parts of the training images over and over.
In figure 8 of the technical report [0], compare the hair in images (0,0), (2,0), (3,0), (3,3), (4,4).
The paper suggests their method generates copyright-free images, yet they are very obviously derived from the input images and you can identify the parts of individual input images that are mashed together to form the output.
All in all, their method seems to be performing "obfuscated memorization," in the sense that the generated images are scrambled up just enough to fool their plagiarism-detector loss function.
But as the online article states, that figure represents a case where the model is explicitly set to "generate images [which] have similar major visual features with different attribute combinations": http://make.girls.moe/#/news
So some degree of repetition is to be expected, since you've turned off random noise. And despite that the images do still exhibit some variation if you look closely.
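That sampling mode amounts to reusing one latent noise vector while varying only the attribute/condition vector. Here's a rough sketch of the idea; the `generator` below is a toy placeholder standing in for the trained model, not the paper's actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained conditional generator G(z, c): the real model
# maps a noise vector z and an attribute vector c (hair colour, eye colour,
# etc.) to an image. Here we just mix z and c deterministically.
def generator(z, c):
    return np.outer(z, c)

# One noise vector, reused for every sample in the grid.
z_fixed = rng.normal(size=64)

# Vary only the attribute combination between samples.
conditions = [rng.integers(0, 2, size=16) for _ in range(5)]
images = [generator(z_fixed, c) for c in conditions]

# With z fixed, all variation between the images comes from c alone, so
# shared "major visual features" across the grid are expected by design.
```

So repeated hair shapes across that figure don't by themselves distinguish memorization from the intended fixed-noise behavior.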
[0] http://make.girls.moe/technical_report.pdf