
FWIW: I've been scraping the shit out of social media for my AI training. I also do Amazon, AliExpress, etc.

Libs like puppeteer are so good these days that it's impossible to tell real users from fake traffic. Most of the blocks are just IP blocks.



Right. And the IP blocks only add a small cost to the scraping, because they just force people to use residential IPs, which can't sanely be blocked.



