What's wrong with scraping a bunch of pages? As long as the scrapers respect robots.txt, it's no big deal.
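For what it's worth, honoring robots.txt is nearly a one-liner with Python's standard library (urllib.robotparser). A minimal sketch; the crawler name and URLs are just placeholders:

    import urllib.robotparser

    # Hypothetical crawler name and site -- swap in your own.
    USER_AGENT = "ExampleBot/1.0"

    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()  # fetch and parse the site's robots.txt

    url = "https://example.com/some/page"
    if rp.can_fetch(USER_AGENT, url):
        print("allowed to fetch", url)
    else:
        print("disallowed by robots.txt:", url)

    # crawl_delay() returns the Crawl-delay directive for this agent, or None
    delay = rp.crawl_delay(USER_AGENT)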


Or even better, have contracts with the companies. Maybe that's unlikely here, but I think “scraping” is too often assumed to be “bad” in some way. The company I work for does a lot of web scraping, but we have contracts with our partners to scrape their websites. Their robots.txt may still ask crawlers to stay out of some areas, but under the contracts we're allowed to bypass those rules.


HN has always had a boner for web-crawler hate. How dare you automate downloading public data that others post online!
