August 19, 2005

Why search index size no longer matters

By Charlene Li

There’s already been a great deal written about the debate between Yahoo! and Google over their relative index sizes (see my previous post as well as Search Engine Watch’s two posts and John Battelle’s numerous posts on it). Like them, I was subjected to numerous phone calls and meetings with both Google and Yahoo! over the past week. Rather than add to the debate, I’d like to talk about what it means and its implications for the future, primarily that reported index size just doesn’t matter anymore.

By way of background, when Google first called me last week to say that they couldn’t confirm Yahoo!’s index size, my first question to them was, “Are you saying that Yahoo! is lying?” Obviously, Google never said this outright, but from their discussions, it was clear they wanted analysts and press to come to that conclusion. Numerous charts and test examples provided by Google intimated that Yahoo!’s claim of 20 billion documents was based on inaccurate counting at best, deliberate obfuscation at worst.

After a week of watching Google flex its PR muscle, Yahoo! responded that it never said that its index was bigger than Google’s, only that it was 20 billion documents deep. Yahoo! said that it could never verify another search engine’s index size and so couldn’t make any such comparison. Yahoo! also strongly stated to me that any other search engine’s claim to be able to do so was unfounded.

I can understand Google’s ego taking a HUGE hit, as it has built its reputation and company culture on the fact that it is/was the “biggest” search engine out there. My advice to them was to move beyond challenging Yahoo! on the actual size of the index and concentrate on relevancy. Yet throughout this whole debate, both Google and Yahoo! continued to focus on index size rather than provide data on how their searches are more relevant.

But on to the implications. I think Google has shot itself in the foot, because the importance of index size is now being widely disputed. Eventually, Google will come out with an update announcing that its index is at XX billion documents (presumably north of 20 billion). Rather than gasp in wonderment at how big Google is, we’ll all just shake our heads and say, “There they go again!”

Now some have called for standards and a way to audit index size, in the belief that understanding the size is important. Hypothetically, size does matter, but an audit plays directly into Google’s argument that there is a “right” way to count documents in an index (and they will argue strenuously that their way is the best). The reality is that index construction and counting are highly customized and proprietary to every search engine and hence can’t be standardized or audited. In the same way, relevance lies in the eye of the beholder: every search engine has its own relevancy metrics, and I can guarantee you that each one thinks it comes out at the top of its own scale!

So we’ll continue to see “index envy” taking place between the search engines, but it’s clear to me that index size is no longer something outsiders can use to gauge how “good” a search engine is. Indeed, as personalized search, vertical search, and integration of content into search results become more important in determining how well we like a search engine, index size will quickly become irrelevant.

Comments

GregBurton

Good points and post. The real question with search is "Can I find what I'm looking for without too much trouble?" Generally, the answer is yes on all the major engines. But they all have to do brand differentiation on something, and so index size is the easy common denominator.

Tim Taylor

So are you saying it's not the size of your index but how you use it? I will try to work "index envy" into my next search conversation, though.

I'm with Greg's comments. I can generally find what I want on any search engine. But if I get more than 10 results, I'm probably not picking something further down the list. If I don't get what I'm looking for, I refine my search rather than looking at search results 11 through 1 million.
