Saturday, March 10, 2007

The Waning Relevance of Search Engines

When the Web was young -- i.e., before the advent of Yahoo! and Google -- USENET[sic] was the primary medium for fielding informational queries on the Internet, and USENET FAQs were the primary medium by which such queries were answered. Incidental to such FAQs were pointers to particularly useful URLs that people had discovered and vetted by following trails of informational bread crumbs. Most of these trails started on USENET. Over time, however, more and more home pages on the World Wide Web became populated with links to interesting and useful websites, and then along came the spiders.

Spiders (now more commonly known as web crawlers) created the first indexes[sic] of the World Wide Web by following links from one URL to another and making a record of what they found along the way. An individual spider or collection of spiders working together can produce a somewhat comprehensive database of URLs on the World Wide Web. However, that database is (at best) a collection of snapshots from some time in the recent past rather than a live feed that includes recent changes, and it will not include URLs found on "The Invisible Web." These distinctions were once lost on most people, who assumed that a "'Net search" would magically provide up-to-the-minute information from the Web; they are still lost on some people who have no idea how a search engine works.
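The crawling behavior described above can be illustrated with a minimal sketch. It is not how any particular search engine's spider was written; the toy_web dictionary, the example URLs, and the crawl function are all hypothetical stand-ins, and the "Web" here is an in-memory table rather than real HTTP fetches.

```python
from collections import deque


def crawl(seed, fetch_links):
    """Breadth-first crawl: follow links outward from the seed URL and
    record the outbound links of every page reached."""
    index = {}                      # URL -> links seen when the page was visited
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if url in index:            # already visited
            continue
        links = fetch_links(url)
        index[url] = list(links)    # store a copy: a snapshot, not a live view
        queue.extend(link for link in links if link not in index)
    return index


# Hypothetical toy web (assumption: these URLs exist only for illustration).
toy_web = {
    "a.example/": ["b.example/", "c.example/"],
    "b.example/": ["c.example/"],
    "c.example/": ["a.example/"],
    "hidden.example/": ["a.example/"],   # no page links to this one
}

index = crawl("a.example/", lambda url: toy_web.get(url, []))

# The web changes after the crawl; the index does not notice.
toy_web["a.example/"].append("new.example/")
```

In this sketch, hidden.example/ never enters the index because no crawled page links to it, and the link added after the crawl is absent from the stored record for a.example/. Those are the two limitations noted above: unlinked pages stay invisible, and the index is a snapshot of the past.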

The term "search engine" predates the Web. The earliest reference to the term that I've found was by Professor Lee A. Hollaar of the University of Utah back in March of 1985. (The Utah Text Search Engine: Implementation Experiences and Future Plans, Proceedings of the Fourth International Workshop on Database Machines.) But the term really caught on with the Internet community in the early 1990s. By that time, spiders were already cataloging an enormous amount of content found on the Visible Web. Even so, very few people otherwise in the know appreciated just how important search engines would soon become, erroneously assuming that anything more complex than the Unix "grep" query was overkill.

A former colleague of mine was a product manager for Netscape before the company was acquired by America Online, and he once recounted to me how popular the search function was on the Netscape website from the very beginning. Meanwhile, almost everyone at Netscape was trying to figure out how to sell the company's Web browsers and its easily forgotten line of related software products, all of which became completely irrelevant when Microsoft started giving away the Internet Explorer Web browser. Even Yahoo! failed to recognize the importance of search-related products, quickly diversifying from a searchable Web directory into a Web portal and outsourcing its search services to Google until a few years ago.

For all its flaws, the Google algorithm is still the standard by which all other search engine algorithms and post-Google information retrieval mechanisms are judged. The only post-Google innovation in content indexing and retrieval that even comes close to being such a standard is Wikipedia, and most of those who consider Wikipedia a successful innovation do so because of Wikipedia's prominence and visibility in Google's search results. Even the blogosphere's importance is largely validated by the impact that it has on Google's search results. However, the importance of search engines reached its zenith a while back, and their relevance is slowly waning.

As I reported previously, Wikipedia recently announced that outbound links from Wikipedia to other websites would include a "nofollow" attribute. The rationale for this decision is that it will reduce the incentive for "black hat" search engine optimizers (SEOs) to spam Wikipedia, as a link from an important website like Wikipedia will normally inflate the PageRank that Google assigns to that link's destination URL. However, I think the end result will be something quite different. To wit, to the extent that Google actually ignores outbound links from Wikipedia, Wikipedia will offer its end users something unique and different from Google. I don't think this will put Google out of business, but it will eventually diminish the relevance of search engines as the primary medium for fielding online informational queries.
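To make the "nofollow" effect concrete, here is a rough sketch using the simplified, textbook form of PageRank (power iteration with a damping factor). It is not Google's production algorithm, and treating a nofollow link as if it simply did not exist is an assumption made for illustration. The three page names and the link list are hypothetical.

```python
def pagerank(pages, links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.
    links is a list of (source, destination, nofollow) tuples;
    nofollow links are dropped before ranking (an assumption)."""
    followed = {p: [] for p in pages}
    for src, dst, nofollow in links:
        if not nofollow:
            followed[src].append(dst)

    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for src in pages:
            # A page with no followed links spreads its rank over all pages.
            targets = followed[src] or pages
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


pages = ["wiki", "blog", "shop"]


def example_links(nofollow):
    # Both other sites link to "wiki"; "wiki" links out to "shop".
    return [
        ("blog", "wiki", False),
        ("shop", "wiki", False),
        ("wiki", "shop", nofollow),
    ]


with_follow = pagerank(pages, example_links(nofollow=False))
with_nofollow = pagerank(pages, example_links(nofollow=True))
```

In this toy graph, the "shop" page scores noticeably lower once the link from "wiki" carries nofollow, which is the incentive change described above: the link still exists for readers of the page, but it no longer passes rank to its destination.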

In a vein similar to Wikipedia, sites like Digg, Fark, Technorati, and reddit are emerging as places for the purveyors of information to hook up with their audiences with less and less concern for the impact that such sites have upon a URL's Google PageRank. Moreover, like Wikipedia, certain online portals -- e.g., eBay and MySpace -- are so prominent that the only reason anyone uses a search engine to find them is that many end users are not in the habit of using bookmarks to pull them up. Even people who choose to use a search engine to find websites with the products, services, or information that they want sometimes complain that highly commercialized search results require a serious researcher to dig to the second or third page of search results to find something useful. In sum, while far from obsolete, the relevance of search engines is slowly waning, and commercial interests are slowly turning their attention to other channels of online marketing.


Vasiliy Kiryanov said...

I like your article. What I like most is the point about the search function on the Netscape website. I had not thought about it earlier. But it really was one of the good search engines of that time.

Wednesday, September 26, 2007 9:48:00 AM  

