If most of Reddit's new content is spambots pretending to have conversations in order to promote their product, why would anyone pay for that? Providing Reddit data to LLM trainers is directly encouraging this outcome, so it's shortsighted.
You've missed my point. Why would anyone pay for it anyway, and is that greater than the opportunity cost of waiting? They already have many billions of unadulterated comments that would work great as training data. How are a couple more, which everyone here seems to think will be corrupted anyway, going to improve the value calculation? Reddit's in the business of running a business, not a public benefit time capsule. You can't criticize just one side of the balance without mentioning the other, so to speak. (And actually, that's worth asking, tangentially: Do you think Reddit's already being contaminated by spambots, or that the only way this happens is if Reddit itself joins in?)