By Charlene Li
After much speculation, Google Base is finally live. In my briefing with them last week, Google said that early reports that Google Base was targeting classifieds were inaccurate – instead, they see it as a way to acquire more information that their crawlers can’t pick up. (It’s pretty much what I wrote about two weeks ago.) Here’s how they explained Google Base, a summary of my experience, and a few thoughts on what this means at the end.
Google sees Base as a natural step in their quest to “organize the world’s information.” Most of Google’s index is collected through the Googlebot crawlers that follow links around the Web. But if that information isn’t in a Web page with static links, it’s really hard for the crawlers to index it. Google has other ways to collect information: for example, Google Print and Google Catalog, where Google scans offline publications for online searching. Google Video is a similar solution for video content owners, while retailers can upload product feeds into Froogle. And Sitemaps makes it easy for Webmasters to upload an entire site to Google – especially helpful for dynamic sites built on databases. Oh, and don’t forget about Blogger and Google Groups for user-generated content. So Google sees Base as an extension of this information-gathering effort. Users are essentially creating their own databases – they can use existing templates or their own.
I had a go at it – I put in my personal profile which I found a bit confusing at first but then pretty quickly figured out. It took about 10 minutes for the item to clear through their “vetting” process. There are templates that you can use (and also append), or you can start from scratch.
To access the information, users conduct a search on Base. Google says that they may elect to include information from Base in general search results – while they didn’t show how this would work, I would imagine that they would do this in much the same way they integrate search results from Print, Froogle, or Local today – as a module that appears above the regular Web search results. And here’s the most interesting part – I asked Google how they planned to integrate Base content with Web-crawled content, and in particular, how they would determine relevance. No answers at this point, but it certainly bears watching. My hunch is that if/when Yahoo! decides to pursue something similar, they will have an advantage in doing this, as Yahoo! has long been integrating paid inclusion and Yahoo! Directory items with their Web-crawled results.
I also asked Google how it was going in their discussions with content owners (several newspapers have confided to me that they had been approached by Google to upload their classifieds database, but that they were skeptical and wary). Google said that there has been some strong interest as these database owners realized that they would benefit from Google sending them traffic. In a quick look at the classifieds, I found listings from The New York Times, CareerBuilder, and Dealer Specialties’ GetAuto site. If this is going to be a newspaper classifieds killer, these newspapers and sites are, at least for now, willing to support Google’s service in order to drive more (free) traffic to themselves.
But as comments to my original blog post on Google Base point out, just having the data isn’t enough – you’ve got to be able to DO something with the data, and no, just being able to search the information isn’t nearly enough. And this is where I think Google is on to something very big. At its core, Google Base is just one very big database of highly structured information. I can’t believe Google will just let it sit there; instead, I expect it will develop APIs on which developers can build applications, in much the same way it allows them to create mash-ups around Google Maps. So rather than having to figure out, build, and maintain lots of different applications itself, Google will allow developers to access the information, on the condition that the applications be “Base enabled”.
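To make the "highly structured information" point a bit more concrete, here is a rough sketch in Python of what querying items by attribute, rather than by keyword, might look like. Everything in it is hypothetical: the `Item` class, the attribute names, and the `find` function are my own inventions for illustration, not anything Google has shown or announced.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A hypothetical Base-style item: a type plus free-form attributes."""
    item_type: str
    title: str
    attributes: dict = field(default_factory=dict)


def find(items, item_type, **criteria):
    """Return items of the given type whose attributes match every criterion."""
    return [
        item for item in items
        if item.item_type == item_type
        and all(item.attributes.get(key) == value for key, value in criteria.items())
    ]


# Made-up sample data, standing in for items users might upload.
items = [
    Item("recipe", "Lemon chicken", {"cuisine": "Greek", "main_ingredient": "chicken"}),
    Item("recipe", "Pad thai", {"cuisine": "Thai", "main_ingredient": "noodles"}),
    Item("vehicle", "Used hatchback", {"make": "Honda", "year": 2003}),
]

greek_recipes = find(items, "recipe", cuisine="Greek")
print([item.title for item in greek_recipes])
```

The point of the sketch is that once items carry typed attributes, an application can ask precise questions ("recipes where cuisine is Greek") that a plain keyword search can only approximate.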
Does this sound familiar? Microsoft’s Windows Live and Office Live are built on a similar premise (albeit sans database -- at least for now) where Microsoft supplies the backend infrastructure and hosting, some tools and data, and a place where developers can market their applications to users.
One last thought on Google Base – right now, anything I post to Base is public, but I may want to keep something private, or accessible to a specific social network. At some point, Google is going to have to allow users to set up these permissions, which adds a layer of complexity to searching. If I’m doing a search for a particular recipe, and I have permission to look at my extended family members’ Base content, Google would have to parse out that information in real time. Not an easy feat, at least on the surface.
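Here is a minimal sketch of that permission check, assuming a simple model in which each post is public, private, or shared with a named set of users. The `Post` class, the visibility values, and the user names are all hypothetical; Base has no such feature today, and a real system would have to do this across an enormous index rather than a short list.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """A hypothetical Base item with an owner and a visibility setting."""
    owner: str
    text: str
    visibility: str = "public"  # assumed values: "public", "private", "shared"
    shared_with: set = field(default_factory=set)


def can_view(post, viewer):
    """Decide whether this viewer is allowed to see this post."""
    if post.visibility == "public" or post.owner == viewer:
        return True
    return post.visibility == "shared" and viewer in post.shared_with


def search(posts, query, viewer):
    """Keyword match, filtered at query time by what the viewer may see."""
    needle = query.lower()
    return [p for p in posts if needle in p.text.lower() and can_view(p, viewer)]


# Made-up sample data.
posts = [
    Post("aunt_may", "Family recipe: apple pie", "shared", {"charlene"}),
    Post("bob", "Recipe for chili"),
    Post("carol", "Secret recipe notes", "private"),
]

print([p.owner for p in search(posts, "recipe", "charlene")])
```

Even in this toy version, the same query returns different results for different people, which is why the permission check has to happen per search rather than once at indexing time.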
So I’m curious to hear from you how you would/wouldn’t use Base -- please add your comment below or email me. I think at the core, it’s not only a matter of technology and feature/function set, but also a matter of how much I trust Google to do the right thing with my data. My hunch is that we will.
Update: Here are links to relevant Google Base posts:
- Google Blog -- includes quotes from companies using Base
- Google's FAQ on Base - includes screenshots. Check out the last FAQ -- because Base creates a unique page for your item, you can buy AdWords that point to it. Hmmm, the profit motive begins to show its hand. Your data now resides on a page instead of in a database, so you can advertise it.
- Search Engine Watch - Danny has a nice review of the ins and outs of the service