
Yours is the first explanation I've seen that makes any case for Microsoft not intentionally cheating. All the responses I've seen from Microsoft are of the form "we're not copying Google, we're just using user click information", which is weak, since what Google showed is that Bing is associating click information with the search terms that were typed into Google. But since those search terms are in the referrer URL, it could conceivably be an innocent general algorithm weighing the words in the URL.
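
To make the "innocent general algorithm" idea concrete, here is a minimal sketch of pulling words out of a referrer URL without any special handling of search engines. The function name `url_words` and the example URL are my own assumptions for illustration; nothing here is known to match what Bing actually does.

```python
import re
from urllib.parse import urlsplit, unquote_plus


def url_words(url):
    """Split a URL's path and query string into lowercase word tokens.

    Hypothetical helper: it treats every URL the same way and has no
    knowledge of which sites are search engines.
    """
    parts = urlsplit(url)
    # Decode percent-escapes and '+' so encoded query text becomes plain words.
    text = unquote_plus(parts.path + " " + parts.query)
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)]


# Assumed example referrer; the host is a placeholder.
words = url_words("https://search.example/search?q=torsorophy")
```

A generic tokenizer like this would pick up the query term simply because it appears in the URL, which is the scenario described above.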

As for ignoring robots.txt, that may not be the case. Conceivably you could take the URL A -> B link, save the metadata for both, mark them as related, and then check both URLs against robots.txt to decide whether each should be in the index. Then if URL A is ".../search?q=torsorophy", Google's robots.txt disallows it from being indexed and only URL B gets in, but the link to "torsorophy" is still there in the metadata.
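
A rough sketch of that flow, using Python's standard `urllib.robotparser`. The robots.txt rules, the `examplebot` agent name, the `index_click` helper, and the example hosts are all assumptions made up for illustration, not a description of any real pipeline.

```python
from urllib.robotparser import RobotFileParser

# Assumed, simplified rules standing in for a search engine's robots.txt.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /url
"""


def make_parser(robots_txt):
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def index_click(referrer, target, referrer_rules, target_rules, agent="examplebot"):
    """Record a referrer -> target click, then index only the allowed URLs.

    The association is stored first; robots.txt is consulted afterwards,
    and only to decide which URLs enter the index as pages.
    """
    association = {"referrer": referrer, "target": target}
    indexed = []
    if referrer_rules.can_fetch(agent, referrer):
        indexed.append(referrer)
    if target_rules.can_fetch(agent, target):
        indexed.append(target)
    return association, indexed


# Placeholder hosts; the target site is assumed to have an empty robots.txt.
association, indexed = index_click(
    "https://search.example/search?q=torsorophy",
    "https://site.example/page",
    make_parser(ASSUMED_ROBOTS_TXT),
    make_parser(""),
)
```

Under these assumed rules only the target page is indexed, yet the stored association still carries the query string from the referrer.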



In fact, what Google users are really clicking in search results are "google.com/url?" URLs, which are also disallowed in robots.txt (while the URLs they redirect to aren't).


Indeed. Certainly it is technologically possible for clickstream-based indexing to still abide by robots.txt rules. However, the Bing toolbar does not. That is the key issue here.


How do you know it does not? I'm assuming robots.txt is about preventing the page contents from being crawled and added to the index. If all they use the click info for is to associate referrers (Google URLs in this case) with pages in the index, and they don't crawl the Google search itself, I don't see how that breaks the robots.txt contract.


The page contents are being crawled and added to the index, but by Bing Toolbar users, not a computer program. I consider that to be an underhanded way to circumvent robots.txt, but others might not.


What makes you say that? We haven't seen anything to indicate Google search pages are in the Bing index.



