
We recently had a bot from Taiwan downloading all of our images, over and over and over - similar to the author. By the time we noticed, they had downloaded them many times over and showed no signs of stopping!

Bots these days are out of control and have lost their minds!



I recently found out that Bytedance was scraping a website of mine over and over again. I don't care about their stupid AI crawler scanning my cheapo server, but they were hitting the same files from different IP addresses, all from the same /56 China Telecom subnet.

I added a firewall rule to block the subnet and that seems to have worked. Earlier attempts involving robots.txt failed, and when I blocked the bots in Nginx my logs still got spammed by all the HTTPS requests.

I don't understand how you could write a scraper like that and not notice that you're downloading the same files over and over again.
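
For anyone wanting to check whether their own traffic clusters the same way, here is a rough sketch of grouping client addresses by /56 prefix with Python's standard ipaddress module. The function name and the sample addresses (taken from the 2001:db8::/32 documentation range) are made up for illustration; they are not from my actual logs, and pulling the addresses out of your access log is left out.

```python
import ipaddress
from collections import Counter

def count_by_prefix(addresses, prefix_len=56):
    """Count how many IPv6 client addresses fall into each prefix."""
    counts = Counter()
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        if ip.version != 6:
            continue  # this sketch only groups IPv6 clients
        # strict=False masks off the host bits instead of raising an error
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts

# Hypothetical sample addresses, not real log data.
sample = [
    "2001:db8:0:1200::1",
    "2001:db8:0:12ff::beef",
    "2001:db8:0:1300::1",
    "192.0.2.10",
]

for network, hits in count_by_prefix(sample).most_common():
    print(network, hits)
```

If one prefix dominates the counts, that prefix is a candidate for a single firewall rule rather than a pile of per-address blocks.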



