Monday, August 07, 2006

What Is the Semantic Web?

In a recent blog post, I said that the ongoing efforts to develop the Semantic Web provide a surprisingly coherent vision of how the Internet should be indexed. However, beyond a passing reference to the fact that the Semantic Web uses a declarative ontological language known as OWL, I didn't really say what the Semantic Web is or what it does. Tim Berners-Lee has published a draft road map for his vision of the Semantic Web, and while that road map is purportedly still evolving, it was last updated in October of 1998, leading me to believe that his vision has not changed much.

Berners-Lee's vision of the Semantic Web was and is a stylized version of Artificial Intelligence ("AI"). To wit: "Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form." As a long-term vision, the idea seems to be that people who publish high-quality Web-based resources in a machine-processable language will make other Web-based resources obsolete. At a more practical level, the Semantic Web relies upon the Resource Description Framework (RDF) to provide a standardized schema for meta data.

The fundamental problem with trying to impose a standardized meta data schema on Web-based publishers is that the Internet is a highly decentralized information resource. Anyone can create a schema of their own for Web-based URIs/URLs, and those URIs/URLs are seldom unique. Indeed, to provide total anonymity and combat censorship, the Freenet Project relies upon user-designated URIs.

At the present time, the closest thing to a Web-based hegemony of meta data is Google. I'm not talking about Google's Sitemaps feature (still in beta and soon to be renamed "Google webmaster tools"), although that is a very good example of the phenomenon. Rather, what I'm talking about is that people who want their fair share of Google's referral traffic do their best to make their websites Google-friendly. Of course, few people who are optimizing their websites for Google are concerned about quality control, and Google's meta data is easily compromised by people who have learned how to game the Google algorithm.

Ignoring for the time being the fact that the quality of meta data for the Web is easily compromised, consider the fact that RDF purports to use triples of URIs to create machine-processable semantic meta data. For the most part, this meta data is simplistic gibberish that mimics subject-verb-object grammar. However, when it comes to large-scale online collaboration, RDF does provide a workable framework for sharing the kind of meta data that is currently confined to private databases.
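To make the subject-verb-object point concrete, here is a minimal sketch of an RDF-style triple store in plain Python. The tuples stand in for a real RDF library, and the example.org URIs are hypothetical; the `dc:` predicate URIs are the real Dublin Core "creator" and "date" terms.

```python
# Each RDF statement is a (subject, predicate, object) triple of strings.
# The example.org URIs are made up for illustration; the purl.org URIs
# are genuine Dublin Core vocabulary terms.

statements = [
    ("http://example.org/posts#semantic-web",          # subject
     "http://purl.org/dc/elements/1.1/creator",        # predicate (dc:creator)
     "http://example.org/people#alice"),               # object
    ("http://example.org/posts#semantic-web",
     "http://purl.org/dc/elements/1.1/date",           # predicate (dc:date)
     "2006-08-07"),
]

def objects_of(subject, predicate, triples):
    """Return every object asserted for a given subject and predicate."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

print(objects_of("http://example.org/posts#semantic-web",
                 "http://purl.org/dc/elements/1.1/creator",
                 statements))
# ['http://example.org/people#alice']
```

The "grammar" here really is just subject-verb-object: the machine can match patterns over the triples, but nothing in the representation itself understands what "creator" means.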

Ongoing development of the Web Ontology Language (aka OWL) -- i.e., the declarative ontological language of the Semantic Web -- culminated in a W3C Recommendation on February 10, 2004, though the language is still somewhat esoteric. It remains to be seen whether OWL will ever garner enough proponents to make the Semantic Web a force to be reckoned with. But AI is not going to go away, and if the Semantic Web isn't the killer application that makes AI a reality, the Semantic Web will still provide a coherent vision for how the Internet should be indexed for decades to come.
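OWL axioms themselves bottom out in RDF triples, so a small fragment of an ontology can be sketched the same way. This is only an illustration, not a real reasoner: the `rdf:`, `rdfs:`, and `owl:` URIs are genuine W3C vocabulary terms, while the example.org class names are hypothetical.

```python
# A toy OWL ontology fragment encoded as RDF triples, plus a naive
# transitive subclass check. Real OWL reasoners do far more than this.

RDF_TYPE      = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
OWL_CLASS     = "http://www.w3.org/2002/07/owl#Class"

ontology = [
    ("http://example.org/ont#Person", RDF_TYPE, OWL_CLASS),
    ("http://example.org/ont#Author", RDF_TYPE, OWL_CLASS),
    ("http://example.org/ont#Author", RDFS_SUBCLASS,
     "http://example.org/ont#Person"),
]

def is_subclass(sub, sup, triples):
    """Naive transitive closure over rdfs:subClassOf."""
    if sub == sup:
        return True
    return any(is_subclass(o, sup, triples)
               for (s, p, o) in triples
               if s == sub and p == RDFS_SUBCLASS)

print(is_subclass("http://example.org/ont#Author",
                  "http://example.org/ont#Person", ontology))
# True
```

Even this trivial inference -- every Author is a Person -- is something the plain triple store above cannot do, which is exactly the gap OWL was designed to fill.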

