I still think this could be worthwhile, though, for these reasons:
- One "quality" poisoned document may be able to do more damage
- Many crawlers will be getting this poison, so this multiplies the effect by a lot
- The cost of generation seems to be much below market value at the moment
I didn't run the text generator in real time (that would defeat the point of shifting cost to the adversary, wouldn't it?). Instead, I generated and cached a corpus, then selectively made small edits (primarily URL rewriting) on the way out.
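A minimal sketch of the serve-from-cache-with-rewriting idea. Everything here is illustrative (the `trap.example.com` host, the regex, the function names are my own assumptions, not the actual implementation): the expensive generation happens once, and only a cheap substitution runs per request.

```python
import random
import re

def rewrite_links(cached_html, fake_host="trap.example.com"):
    # Hypothetical sketch: take a pre-generated page from the cache and
    # rewrite each href on the way out, so every response hands the
    # crawler fresh links pointing back into the trap.
    def fresh_url(match):
        token = "%08x" % random.getrandbits(32)
        return 'href="https://%s/%s"' % (fake_host, token)
    return re.sub(r'href="[^"]*"', fresh_url, cached_html)
```

The per-request cost is a single regex pass over cached text, which is cheap compared to generating new prose each time.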
To generate the garbage data, I've had good success with Markov chains in the past. These days I'd probably try an LLM with the temperature turned up.
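For reference, a word-level Markov chain generator is only a few lines. This is a generic sketch of the technique, not the author's code; the order-2 prefix and the seed-text handling are my own choices.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    # Map each `order`-word prefix to the list of words seen after it.
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        chain[prefix].append(words[i + order])
    return chain

def generate(chain, length=50):
    # Start from a random prefix and walk the chain, picking a
    # random observed successor at each step.
    prefix = random.choice(list(chain))
    out = list(prefix)
    while len(out) < length:
        followers = chain.get(tuple(out[-len(prefix):]))
        if not followers:
            break
        out.append(random.choice(followers))
    return " ".join(out)
```

Trained on a real corpus, the output is locally plausible but globally meaningless, which is exactly the property you want in crawler poison, and generation is orders of magnitude cheaper than an LLM call.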