You checked it against the assertion of the most interested party? It's not "cli...

user24 · on Feb 2, 2011

> You checked it against the assertion of the most interested party?

There are currently exactly two sources: The Google post and the MS post. Who are more likely to know about what MS are doing? MS. I check my theory about what MS are doing with the source most likely to contain correct information.

> It's not "click" data

It's 'clickstream' data (MS's term). A 'click' comes from a page and goes to a page. That's the data MS were capturing. The page the click happened on (query happens to be included in URL), and the page the click went to. It's click data.

Terretta · on Feb 3, 2011

Your assertion Bing is most likely to be accurate about Bing ignores self-interest and spin.

However, agreed -- clickstream means series of clicks, and the actual data is a series of URLs.

The query "happening" to be in the URL has no "search > result" meaning without a parser being told to look for Google's particular keyword query indicators and correlate the subsequent page. As most URLs are not searches, this is not emergent behavior; it's programmed.

People also talk about this being a "weak" signal, but given search volume (or clickstream volume if you prefer) on Google versus other sources, even if this code is generic (e.g., recognize all "q=blah" or "search=blah" as keywords and correlate the following URL), it seems the signal would be strong indeed. Google's weak signal would provide several times more correlative data to Bing than Bing's own clicks.

Not that there's anything wrong with that! But Bing's blog assertions feel disingenuous -- they play this game well:

http://www.wired.com/epicenter/2009/06/kayak-bing/
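
For illustration, the generic variant mentioned above (recognize "q=blah" or "search=blah" as keywords and correlate the following URL) might look roughly like the sketch below. This is only a guess at the shape of such code, not anything Bing has published: the key list `QUERY_KEYS`, the function names, and the example URLs are all assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical list of query-string keys treated as "search keywords".
# The thread only mentions "q" and "search" as examples.
QUERY_KEYS = ("q", "search")


def extract_query(url):
    """Return the search term carried in the URL's query string, or None."""
    params = parse_qs(urlsplit(url).query)
    for key in QUERY_KEYS:
        if key in params:
            return params[key][0]
    return None


def correlate(clickstream):
    """Pair each search term with the URL visited immediately after it.

    `clickstream` is an ordered list of URLs, as a toolbar might observe them.
    """
    pairs = []
    for current_url, next_url in zip(clickstream, clickstream[1:]):
        term = extract_query(current_url)
        if term is not None:
            pairs.append((term, next_url))
    return pairs


stream = [
    "http://www.google.com/search?q=blah",
    "http://example.com/landing",
    "http://example.com/about",
]
pairs = correlate(stream)  # one (term, clicked URL) pair for this stream
```

Even in this generic form, the code has to be told which keys count as keywords, which is the point about the behavior being programmed rather than emergent.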

user24 · on Feb 3, 2011

> Your assertion Bing is most likely to be accurate about Bing ignores self-interest and spin.

We can't apply skepticism to one source and not the other. Either Google and Bing are not blogging with self-interest and spin - and thus Bing are more trustworthy because they're blogging about themselves, or they both are blogging with self-interest and spin - and still Bing are more trustworthy because they're blogging about themselves.

You just can't legitimately discount what Bing say because of self-interest and spin without also discounting what Google say for the same reasons.

> The query "happening" to be in the URL has no "search > result" meaning without a parser being told to look for Google's particular keyword query indicators and correlate the subsequent page.

No. Remove the non-alpha characters from the entire URL, with no preconception about search queries or any of that. You're left with "google com search q QUERYTERM". All the words apart from QUERYTERM have plenty of other signals in Bing's system. If QUERYTERM is a highly unusual word then all Bing have to go on is the data they gleaned from Google.
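
A minimal sketch of that idea, assuming a hypothetical set `known_vocabulary` standing in for "words that already have plenty of other signals" (the function names and the vocabulary are made up for the example, not taken from either company's post):

```python
import re


def url_tokens(url):
    """Split a URL on every run of non-alphabetic characters."""
    return [token for token in re.split(r"[^A-Za-z]+", url) if token]


def unfamiliar_tokens(url, known_vocabulary):
    """Return the tokens with no prior signal, i.e. not in the known set."""
    return [t for t in url_tokens(url) if t.lower() not in known_vocabulary]


url = "http://www.google.com/search?q=QUERYTERM"

# Hypothetical: words the index has already seen in many other contexts.
known_vocabulary = {"http", "www", "google", "com", "search", "q"}

tokens = url_tokens(url)
rare = unfamiliar_tokens(url, known_vocabulary)
```

Nothing here looks for a search query as such; the unusual word simply falls out as the only token without other signals.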

Terretta · on Feb 3, 2011

> We can't apply skepticism to one source and not the other. Either Google and Bing are not blogging with self-interest and spin - and thus Bing are more trustworthy because they're blogging about themselves, or they both are blogging with self-interest and spin - and still Bing are more trustworthy because they're blogging about themselves.

Libel laws prohibit indiscriminate accusation, while there's no law against puffery.

The premise a company's own public relations messaging is more trustworthy than an outsider because the company's PR is about themselves seems without merit -- otherwise we would deem companies more trustworthy whenever an outsider points fingers, and send all the journalists, whistleblowers, and wiki-leakers home. "Nope, sorry, I believe the company, because they're talking about themselves."

Google presented incontrovertible data. Bing's PR tactic is "Google does this or that worse thing and profits off it" -- distracting hand waving -- "plus we're not copying anyway" -- deliberately disingenuous.

A responsive statement would be:

Of course our toolbar is recognizing search terms across the top N search sites, and correlating human-selected results as an indicator of search intent and result quality for those search terms. This is the same thing you do when you look at your own web stats and check inbound search terms for your own pages: 'How relevant are my pages, and am I showing my users what they are looking for?'

This is the very definition of 'improving your search experience' as outlined when you install our toolbar. We're thrilled so many of you chose our Internet Explorer browser and Bing Toolbar that this provides us meaningful data on user search intent. We want to thank Google for demonstrating we are truly 'improving your search experience' using well-accepted Internet crowd-sourcing techniques.

We agree, however, that generating correlations solely from competitor listings -- when we have no existing correlation in our own data corpus -- could be misperceived, so going forward, we will not create correlations solely from competitor results where none existed in our data. However, like every webmaster, we will continue to use crowd-sourced search term and results data from across the web to refine our suggestion order towards best predicting the information you want to find.

user24 · on Feb 3, 2011

Nice response. I wrote my own "What Bing should have said" response too - http://www.puremango.co.uk/2011/02/what-bing-should-have-sai... - I think we pretty much agree.

> The premise a company's own public relations messaging is more trustworthy than an outsider because the company's PR is about themselves seems without merit

That sounds right. Hmm, fair point.