The Semantic Web

The Future is Here
By Ayesha Khanna

“If [computer networking] were a traditional science, Berners-Lee would win a Nobel Prize,” Eric Schmidt, CEO of Novell, once commented. Indeed, Tim Berners-Lee revolutionized the world when he created the web in 1991. Now he is talking about the second generation of the web, and his talks are generating buzz: the W3C is establishing standards for it, and universities, companies and industry consortiums are building the technologies necessary for it. He refers to it as The Semantic Web.

The Semantic Web is envisaged as a place where data can be shared and processed by automated tools as well as by people. The key lies in the automation and integration of processes through machine-readable languages. In order to leverage and link the vast amounts of information available on the Web, software agents must be able to comprehend the information; that is, the data must be written with machine-readable semantics. For example, whether I use the tag "dead" or the tag "alive" next to a person’s name on my webpage makes no difference to a piece of software. Some additional semantics, or metadata, must be added in order for a software program to make an intelligent assessment of the state of the person. This metadata, the meaning (as opposed to the display) of information, is what is known as semantics.

Semantic: of or relating to meaning in language.

Let’s consider an example of the advantages of having semantics that add meaning to the information on the Web. Say you live in New York and decide to attend a conference in London. You would have to go to many airline websites and look at all flights leaving from New York to London. Then, you would go to various hotel websites and look for a hotel near your conference location that has a room available. That’s a fair bit of searching and entering information that you have to do. Luckily, you can search the information on the Web, and in most cases, you can pay for everything on the Web.

Now imagine another scenario. You are driving down 5th Avenue in Manhattan. Your secretary calls you on your cell phone and says that you’ve been asked to be the keynote speaker at the Banking Europe 2005 Conference on May 5, 2005. You say that’s great and you begin to make plans for your trip. You flip open your Palm Pilot, which is connected to the web, and you type in some commands: Book return ticket from New York to London May 5-11; book room in hotel near the conference location, Hilton London Metropole, in London.

Your Palm Pilot has a software program, or software agent, that understands your commands: it processes the semantics of your command intelligently. Your agent buys your ticket and books a room in a hotel. As you drive into your garage, your Palm Pilot beeps and asks you to confirm the information. You park your car, confirm the bookings, and then you go inside and dress for your night out. This is just one example of how good and easy life gets when the web is an intelligent partner in your universe.

Ontologies for Knowledge Representation

In order for computers to provide more help to people, the Semantic Web augments the current Web with formalized knowledge and data that can be processed by computers. To be able to search and process information such as airline flights, software programs need information that has been modeled in a coherent manner. An ontology models all the entities and relationships in a domain.

Ontology: concerned with the nature and relations of being.

Continuing with our example, let’s create a hypothetical ontology for Virgin Atlantic’s flights. An ontology for the airline industry would model its metadata using the following semantics (the kind of relationship is noted in braces):

A flight has an origin, destination, flight number, departure time, arrival time, class {attributes}
An international flight is a type of flight {inheritance}
A flight can have one origin {one-to-one association}
A flight can have many classes {one-to-many association}

In other words, an ontology captures the attributes of an entity and inheritance relationships, as in object-oriented programming, and it also captures associations and their cardinality, as in relational databases (see Figure 1).


The specific information or instance of this metadata for a particular flight may be as follows:

Flight Number: VS018
Origin: New York (EWR)
Destination: London (LHR)
Departure Time: 08:20, May 5, 2005
Arrival Time: 20:00, May 5, 2005
Class: Economy
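
The model and the VS018 instance above can be sketched in Python. This is only an illustration: the class and field names (Flight, InternationalFlight, classes and so on) are assumptions made for this example, not part of any published airline ontology.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Flight:
    # Attributes of the Flight entity
    flight_number: str
    origin: str             # a flight has exactly one origin
    destination: str
    departure_time: str
    arrival_time: str
    # A flight can have many classes (one-to-many association)
    classes: List[str] = field(default_factory=list)


@dataclass
class InternationalFlight(Flight):
    # Inheritance: an international flight is a type of flight
    pass


# The instance: the metadata filled in for one particular flight
vs018 = InternationalFlight(
    flight_number="VS018",
    origin="New York (EWR)",
    destination="London (LHR)",
    departure_time="08:20, May 5, 2005",
    arrival_time="20:00, May 5, 2005",
    classes=["Economy"],
)
```

The classes describe the ontology; the vs018 object is one instance of it.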

With these semantics, you can type the following commands, for example, for your software agent:

flight origin: “New York” destination: “London”
departure: “May 5, 2005”, arrival: “May 5, 2005”

Without a standard naming convention for concepts such as destination, your software agent cannot present your commands to Virgin Atlantic’s server. In addition, it is important that British Airways’ server understands these semantics as well so that you can search for tickets on that airline. When you model the concepts in a domain, such as the airline industry, and publish them, you are in essence creating an ontology.
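
To see why the shared names matter, consider a minimal sketch in which two airlines store the same facts under different local field names. Everything here is hypothetical: the local field names, the mappings, and the sample records (including the British Airways flight identifiers) are invented for illustration.

```python
# Hypothetical mappings from each airline's local field names to the
# shared terms (origin, destination, departure) of a published ontology.
VIRGIN_MAPPING = {"from": "origin", "to": "destination", "departs": "departure"}
BA_MAPPING = {"dep_city": "origin", "arr_city": "destination", "dep_date": "departure"}


def to_shared(record, mapping):
    """Rename an airline's local field names to the shared vocabulary."""
    return {mapping.get(key, key): value for key, value in record.items()}


def matches(flight, query):
    """A flight matches when every term in the query has the same value."""
    return all(flight.get(term) == value for term, value in query.items())


# Invented sample data, as each airline might hold it internally.
virgin_flights = [
    {"from": "New York", "to": "London", "departs": "May 5, 2005", "number": "VS018"},
]
ba_flights = [
    {"dep_city": "New York", "arr_city": "London", "dep_date": "May 5, 2005", "number": "BA-X1"},
    {"dep_city": "New York", "arr_city": "Paris", "dep_date": "May 5, 2005", "number": "BA-X2"},
]

# One query, written once in the shared vocabulary.
query = {"origin": "New York", "destination": "London", "departure": "May 5, 2005"}

catalog = [to_shared(f, VIRGIN_MAPPING) for f in virgin_flights]
catalog += [to_shared(f, BA_MAPPING) for f in ba_flights]
results = [f["number"] for f in catalog if matches(f, query)]
```

Because both airlines map their data onto the same terms, the agent can pose a single query and compare answers from both.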

The Semantic Web Architecture
Now that we’ve discussed both the vision of the Semantic Web and the necessity of ontologies for knowledge representation, we can explore how the model is implemented.

There are several important steps in the workflow of the example we discussed above:

(1) Modeling the specifics of a resource such as the Virgin Atlantic flight VS018 from New York to London.
(2) Modeling the concepts of the entire airline industry.
(3) Trusting that the information provided by an airline or a ticket broker is correct.
(4) Sending commands and receiving results. The first three points concern information and its validity; this one concerns the mechanics of exchanging it.

Semantics are covered by technologies that deal with creating the kind of ontologies discussed in the last section. Once information has been categorized and structured across the Web, we need to be able to trust it and to process it. For these last two pillars of the Semantic Web, we'll discuss trust networks and software agents next.

Web of Trust

We can model the information, but how do we trust the information that we get from the Semantic Web, and how do we protect our information? If my software agent finds two travel agents, one of whom says the price for a Virgin Atlantic ticket is $180 and the other says the price is $210, whom do I believe? In the Semantic Web, we depend on digital signatures and community networks.

Digital signatures are necessary to ensure that the information that claims to be coming from a source was not tampered with by the time it got to you and that its origin was indeed the source. Based on mathematics and principles of cryptography, digital signatures are an electronic form of the penned signature that we use when signing legal documents. You would be sure that your conference invitation came from the head of the European Banking Commission, for instance, because the electronic invitation had the digital signature of the chairman of the board.
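
The sign-and-verify idea can be sketched with textbook RSA and Python's standard library. This is a toy under loud assumptions: the two primes are small, publicly known Mersenne primes, there is no padding, and the invitation text is invented. It is not secure; real systems rely on vetted cryptographic libraries and far larger keys.

```python
import hashlib

# Toy key material for illustration only (NOT secure):
# two publicly known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest(message: str) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N


def sign(message: str) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Anyone holding the public pair (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


# Hypothetical invitation text, signed by the sender.
invitation = "You are invited to give the keynote at Banking Europe 2005."
signature = sign(invitation)
```

Calling verify(invitation, signature) succeeds, while the same signature checked against an altered message fails. That is the property that lets a recipient detect tampering and confirm the origin.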

Community networks are networks of your friends that have verified certain sites as trustworthy. The concept is quite similar to credit rating agencies, such as Moody's, that rate the creditworthiness of countries and corporations. Here, a consensus is built amongst a community regarding the validity of a site, such as a travel agency, and used by others in the network. One such example is the FOAF (Friend-Of-A-Friend) network, which is helping to build a prototype of the “Web of Trust” that Berners-Lee refers to in his Semantic Web roadmap.
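
One way to picture community consensus is a small friend-of-a-friend graph in which people vouch for or flag sites. The sketch below is an assumption-laden illustration: the people, the sites, the two-hop limit and the simple majority rule are all invented, and it does not reflect how FOAF itself is specified.

```python
from collections import deque

# Hypothetical friend-of-a-friend graph: who each person knows directly.
KNOWS = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "bob": [],
    "carol": [],
}

# Hypothetical verdicts: True means "vouched for", False means "flagged".
VERDICTS = {
    "alice": {"agent-a.example": True},
    "bob": {"agent-b.example": False},
    "carol": {"agent-a.example": True, "agent-b.example": False},
}


def community(person, max_hops=2):
    """Collect everyone reachable from `person` within `max_hops` links."""
    seen, queue = {person}, deque([(person, 0)])
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not look beyond the hop limit
        for friend in KNOWS.get(current, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    seen.discard(person)
    return seen


def is_trusted(site, person, max_hops=2):
    """Trust a site when more community members vouch for it than flag it."""
    votes = [
        VERDICTS[member][site]
        for member in community(person, max_hops)
        if site in VERDICTS.get(member, {})
    ]
    return votes.count(True) > votes.count(False)
```

With this data, an agent acting for "me" would accept a quote from agent-a.example and reject one from agent-b.example.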


Software Agents & Semantic Web Services

Ontologies make up the knowledge representation component of the Semantic Web, but that component is incomplete without software programs that can communicate with each other. We still need a mechanism by which a software agent goes to Virgin Atlantic and requests information on flights to London from New York on May 5, 2005. The best mechanism for invoking other applications on the Web using request parameters is currently found in Web Services.

Web Services: software programs that can invoke tasks on each other across the Web.

Web Services are a layer of abstraction above software programs and allow services to be located and invoked across the Web. Thus programs written in various programming languages on different platforms can call each other using the Web Services interface. The services offered by a company are published and advertised in a public registry. Software agents process the information in the registry to locate and use services. For example, a software agent would go to the yellow pages of the registry and search for a travel agent. The registry would contain the Internet address of the agency and how to communicate with it.
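
The publish, find and invoke cycle described above can be sketched in a few lines. This is an in-memory stand-in, not a real Web Services stack: the registry, the service name, the address and the quoted price are all hypothetical, and a local function call takes the place of a network request.

```python
# Hypothetical in-memory registry: category -> list of published services.
REGISTRY = {}


def publish(category, name, address, handler):
    """A company advertises a service under a yellow-pages category."""
    REGISTRY.setdefault(category, []).append(
        {"name": name, "address": address, "handler": handler}
    )


def find(category):
    """An agent looks up every service advertised under a category."""
    return REGISTRY.get(category, [])


def quote_flight(origin, destination):
    # Stand-in for a remote call; a real service would be reached over
    # the network at its published address.
    return {"origin": origin, "destination": destination, "price_usd": 180}


# The travel agency publishes its service once.
publish("travel agent", "Example Travel", "http://travel.example/quote", quote_flight)

# The software agent searches the registry, then invokes what it finds.
service = find("travel agent")[0]
reply = service["handler"]("New York", "London")
```

The agent never needs to know how the service is implemented, only where the registry says it lives and how to call it.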

Software Agent: an artificial agent which operates in a software environment.

DARPA (Defense Advanced Research Projects Agency) has been working on an extension of Web Services known as Semantic Web Services. Semantic Web Services build semantics and ontologies that describe services available on the Web. Using these service semantics, software agents will be able to understand the services described in the yellow pages registry. Semantic Web Services would, therefore, greatly enhance the capability of software agents to search and execute particular services without human intervention.
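
As a rough sketch of what describing services with semantics might buy an agent, suppose each service declares which ontology concepts it needs as input and which concept it produces. The service names, concept names and matching rule below are assumptions made for illustration; they are not taken from any Semantic Web Services specification.

```python
# Hypothetical service descriptions: each lists the ontology concepts it
# requires as input and the concept it produces as output.
SERVICES = [
    {"name": "FlightSearch",
     "inputs": {"origin", "destination", "departure"},
     "output": "Flight"},
    {"name": "HotelSearch",
     "inputs": {"city", "check_in", "check_out"},
     "output": "HotelRoom"},
    {"name": "SeatUpgrade",
     "inputs": {"flight_number", "frequent_flyer_id"},
     "output": "Flight"},
]


def usable_services(goal, known_facts):
    """Return services that produce `goal` and whose inputs the agent can supply."""
    available = set(known_facts)
    return [
        service["name"]
        for service in SERVICES
        if service["output"] == goal and service["inputs"] <= available
    ]


# What the agent knows after reading the traveller's command.
facts = {"origin": "New York", "destination": "London", "departure": "May 5, 2005"}
```

Here usable_services("Flight", facts) selects only FlightSearch: SeatUpgrade also produces a Flight, but the agent cannot supply its inputs, so no human has to rule it out.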

Present Efforts & Future Directions

There are three factors necessary for the success of the Semantic Web: first, the establishment of standards by the W3C or World Wide Web Consortium (the organization that is responsible for standardizing all the technologies related to the Web); second, the development of technologies that facilitate the implementation of software agents and other aspects of Berners-Lee’s vision; and third, the production of tools that encourage people to adopt the technologies that will facilitate the universality of the Semantic Web.

The W3C, led by Tim Berners-Lee and Eric Miller, has made great progress on the standards for the Semantic Web. In 2002, several new recommendations and working drafts emerged for OWL and RDF, the two main standards for the Semantic Web. Technologies such as Web Services and digital signatures are also examples of relatively recent developments that will greatly facilitate the implementation of the Semantic Web. Examples of implementations include MusicBrainz, which provides an encyclopedia of music marked up in RDF; Friend-Of-A-Friend, which uses RDF to mark up the identity of community members and provides a basis for a web of trust; and the Retsina Calendar Software Agent, an agent developed for calendar scheduling by Carnegie Mellon University.

As for encouraging people to mark up their web information, I tend to agree with James Hendler, Professor at the University of Maryland and a prolific writer on the Semantic Web, that “ideally, most users shouldn’t even need to know that Web semantics exist”. Tools must be constructed that automatically pop up forms for ontology linkages in order to overcome the initial hesitance that people have in learning semantic markup languages. DARPA (Defense Advanced Research Projects Agency) is funding a number of such free tools so that people mark up their Web pages. One example is an ontology editor, Protégé, developed by Stanford University, which is free and available for download from the Stanford website.

Of course, we have spoken of more than just individual web pages; in our hypothetical example, we considered the importance of ontologies, and these are usually developed by industry consortiums. Luckily, the creation of ontologies is already underway. Many industries have realized that they need industry standards to facilitate inter- and intra-firm communication. One example of this is FpML (Financial Product Markup Language), an ontology for financial instruments which will facilitate automated trading between banks.

These efforts all point to the growing importance and, in my mind, the inevitability of the establishment of the Semantic Web. Just because it sounds like science fiction doesn’t mean it’s impossible. The Semantic Web is an incredibly exciting and promising place for developers to work. It will revolutionize the way we interact, live and do business today. If you have seen movies like The Matrix and Minority Report, you have glimpsed the new kind of artificial intelligence that uses the Web to process information rapidly and automatically. Who knows? One day, you might very well be able to just speak to your small Palm Pilot or laptop instead of typing in the commands. Even today, companies such as IBM produce simple voice recognition software for computers that allows you to “speak” to your computer. The key point to remember is that the computer needs a defined set of semantics for it to understand your commands and for it to be able to communicate with other software programs on the Web. For now, the W3C is defining standards, new technologies like Web Services and XML schemas have emerged that will make the transition easier, and industries and companies are focusing on making better models to represent their knowledge.

I predict that industries will develop ontologies which will be used for their internal communication. Eventually, each industry, such as financial services, retail, and shipping, will merge its internal ontologies and present a coherent protocol for communication with its systems. At that point, the Semantic Web will evolve from existing in pockets to becoming a universal infrastructure. Eventually, with increasingly unambiguous markup of web content, the Semantic Web will evolve into Tim Berners-Lee’s vision of “an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”

Published March 03, 2005
