Thursday, February 22, 2007

Grokking the Semantic Web

Although I've written previously about what the Semantic Web is, I've always felt that there should be a simpler way to explain it. While exploring the blogosphere, I found a blog post on ....more semantic! that finally did just that by treating Wikipedia as a global ontology:
"[Y]ou could . . . use a wikipedia reference to indicate the semantic concept that you are writing about. Thus, . . . you could use the link to indicate that you are refering to the city of Rome, the capital of Italy."
This simple statement sums up just how straightforward it would be for someone to publish a document on the World Wide Web that is compatible with the Semantic Web. It also makes it abundantly clear just how tedious and time-consuming such a task would be, since every ambiguous term in the document would need its own link.
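To make the idea concrete, here is a minimal sketch of what tagging a single term might look like. The helper name annotate and the specific URL are my own assumptions for illustration, not something taken from the quoted post; the point is only that the link target, rather than the word itself, carries the meaning.

```python
from html import escape


def annotate(term: str, concept_url: str) -> str:
    """Wrap a term in a link whose target identifies the intended concept."""
    return '<a href="{}">{}</a>'.format(
        escape(concept_url, quote=True),  # keep the attribute value well-formed
        escape(term),                     # keep the visible text well-formed
    )


# Assumed example URL: a stand-in for the kind of Wikipedia link the quote describes.
ROME_CITY = "https://en.wikipedia.org/wiki/Rome"

print(annotate("Rome", ROME_CITY))
```

Doing this by hand for every ambiguous word in a document is exactly the tedium described above.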

Most people who describe the Semantic Web talk about making web documents easier for machines to process. That's true enough, but it misses the point. At its essence, the Semantic Web is about disambiguation, something that Wikipedia does quite well, even breaking through language barriers in the process.

The other day, a friend of mine who is taking a conversational Spanish class asked me if I could help her find something that was written in Spanish to complete an assignment for her class. I immediately thought of Wikipedia and how it links from articles in one language to articles in another language, so I told my friend to find a Wikipedia article in English that she liked and follow the link to the Spanish Wikipedia. I then directed her to use Babelfish for a crude translation of the article on the Spanish Wikipedia.

The Wikipedia article that my friend picked was Cat. However, when I followed the link to the corresponding article on the Spanish Wikipedia, I discovered that the two articles were quite different, so I found the appropriate article on the Spanish Wikipedia and changed the outgoing link on the English Wikipedia. Prior to the change I made, the English Wikipedia considered Cat to be a synonym for House cat, but linked to the more generic Spanish Wikipedia article entitled Felis rather than the more specific article entitled Gato doméstico, which in turn redirected to Felis silvestris catus.

This simple exercise demonstrates how a semantic search algorithm could drastically improve the relevance of keyword-based search results, something that I hinted at in my previous XODP Blog post entitled Wordnet, Disambiguation, and the Semantic Web. And contrary to what some have suggested, a semantic search engine would not need to be intimately familiar with a user or with the context of a particular user's search. It could default statistically to the most likely meaning of a particular word occurrence and still allow an end user to provide feedback on what he or she really meant.
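A rough sketch of that "default statistically, then accept feedback" idea follows. It is only an illustration under stated assumptions: the class name SenseGuesser, its methods, and the counts are all made up, and a real engine would draw its statistics from something like link or query data rather than hand-entered numbers.

```python
from collections import Counter, defaultdict


class SenseGuesser:
    """Pick the most frequently observed sense of a word; learn from corrections."""

    def __init__(self):
        # word -> Counter mapping each candidate sense to how often it was seen
        self._counts = defaultdict(Counter)

    def observe(self, word, sense, weight=1):
        """Record that `word` was used with the given `sense`."""
        self._counts[word.lower()][sense] += weight

    def guess(self, word):
        """Return the statistically most likely sense, or None if the word is unknown."""
        senses = self._counts.get(word.lower())
        if not senses:
            return None
        return senses.most_common(1)[0][0]

    def correct(self, word, intended_sense):
        """User feedback: count the sense the user actually meant."""
        self.observe(word, intended_sense)
        return intended_sense


guesser = SenseGuesser()
# Made-up counts for illustration; the sense labels are the article titles mentioned above.
guesser.observe("cat", "Felis silvestris catus", weight=90)
guesser.observe("cat", "Felis", weight=10)

print(guesser.guess("cat"))       # default: the most common sense
guesser.correct("cat", "Felis")   # one user says they meant the genus
print(guesser.guess("cat"))       # the default only shifts once feedback accumulates
```

Note that nothing here requires knowing who the user is: the default comes from aggregate counts, and the feedback step simply adds one more observation.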

All of this begs (or rather raises) the question of whether people need or want a user agent that is this sophisticated when it comes to searching the Web. I've done a significant amount of end user training, both for my private clients (most of whom are lawyers) and during presentations that I've made at Mandatory Continuing Legal Education (MCLE) seminars, and few people ever tax my knowledge base with their questions. In fact, I got my best reviews (5 out of a possible 5 for 90 percent of the queries asked of seminar attendees) by taking a full 30 minutes to explain the anatomy of hypertext links, knowledge that most Web-savvy individuals take for granted. Thus, I am inclined to believe that both the Semantic Web and semantic search will remain solutions looking for a problem for the foreseeable future; at best, they may become solutions made by geeks and for geeks.

