Wednesday, December 16, 2009

XML and Web Services

This post is the first in a series on XML and its use in Web Services. XML stands for Extensible Markup Language, a markup language that lets you define your own tags. The World Wide Web Consortium (W3C) created XML to overcome the limitations of HTML, the Hypertext Markup Language that is the basis for all Web pages.

Like HTML, XML is based on SGML (Standard Generalized Markup Language). Although SGML has been used in the publishing industry for decades, its perceived complexity intimidated many people who otherwise might have used it. XML was designed with the Web in mind.

XML is not a replacement for HTML, because the two were designed with different goals. XML was designed to transport and store data, with a focus on what the data is. In contrast, HTML was designed to display data, with a focus on how the data looks. HTML is about displaying information, while XML is about carrying information.

HTML is the most successful markup language of all time. You can view the simplest HTML tags on virtually any device, from palmtops to mainframes, and you can even convert HTML markup into voice and other formats with the right tools. Given the success of HTML, why did the W3C create XML? To answer that question, take a look at this example HTML code:

<p><b>Mr. John Smith</b><br>
325 Knightsbridge Court<br>
Some Place, CA 94503-4160</p>

The problem with HTML is that it was designed with humans in mind. Even without viewing the above HTML code in a browser, anyone can determine that it is a mailing address. (Specifically, it's a mailing address for someone in the United States; even if you're not familiar with all the components of U.S. postal addresses, you could probably guess what this represents.)

As humans, we have the intelligence to determine the meaning and intent of most documents. A machine can't do that yet. While the tags in this document tell a browser how to display the information, they don't tell the browser what the information is. We know it's an address, but a machine doesn't.
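
To make this concrete, here is a minimal Python sketch that feeds the HTML snippet above to the standard library's `html.parser` and lists what a program actually sees. The class name `TagCollector` is just a name chosen for this illustration.

```python
from html.parser import HTMLParser

HTML_SNIPPET = """<p><b>Mr. John Smith</b><br>
325 Knightsbridge Court<br>
Some Place, CA 94503-4160</p>"""


class TagCollector(HTMLParser):
    """Records every start tag and every non-empty run of text."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        # Skip whitespace-only chunks such as bare newlines.
        if data.strip():
            self.text.append(data.strip())


collector = TagCollector()
collector.feed(HTML_SNIPPET)
collector.close()

print(collector.tags)  # ['p', 'b', 'br', 'br']
print(collector.text)
```

The only tags the program finds are `p`, `b`, and `br`. They describe paragraphs, bold text, and line breaks, and nothing in them marks the third line of text as a city, state, and postal code.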

Now let's look at an example of XML. In HTML, the tags are predefined; with XML, you can create your own tags and give them meaning within the document. More importantly, it's easy for a machine to process the information as well. You can extract the postal code from the document below by simply locating the content surrounded by the <postalCode> and </postalCode> tags, technically known as the <postalCode> element.
<address>
<name>
<title>Mr.</title>
<firstName>John</firstName>
<lastName>Smith</lastName>
</name>
<street>325 Knightsbridge Court</street>
<city>Some Place</city>
<state>CA</state>
<postalCode>94503-4160</postalCode>
</address>
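
As a rough sketch of the extraction described above, the following Python uses the standard library's `xml.etree.ElementTree` to pull values out of the address document by element name. The variable names are my own, and the indentation inside the XML string is added only for readability.

```python
import xml.etree.ElementTree as ET

ADDRESS_XML = """<address>
  <name>
    <title>Mr.</title>
    <firstName>John</firstName>
    <lastName>Smith</lastName>
  </name>
  <street>325 Knightsbridge Court</street>
  <city>Some Place</city>
  <state>CA</state>
  <postalCode>94503-4160</postalCode>
</address>"""

# Parse the document; the root element is <address>.
address = ET.fromstring(ADDRESS_XML)

# Look up elements by name instead of guessing from layout.
postal_code = address.findtext("postalCode")
last_name = address.findtext("name/lastName")

print(postal_code)  # 94503-4160
print(last_name)    # Smith
```

The code never has to guess which line holds the postal code; it simply asks for the element by name.
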
Again, XML was not designed to be a replacement for HTML. HTML was designed to display data, with the emphasis on how the data looks in different browsers. XML was designed to store and transmit data, with the emphasis on what the data is. XML is about creating documents with self-describing data.

How is XML changing the Web? Here are a few key areas:

* XML simplifies data exchange. Different organizations (or even different parts of the same organization) rarely standardize on a single set of tools. As such, it can take a significant amount of work for applications to communicate with each other. Using XML, each organization creates a single utility that transforms its internal data formats into XML and vice versa. In addition, there's a good chance that its software vendors already provide tools to transform database records (or LDAP directories, or purchase orders, and so forth) to and from XML.

* XML enables smart code. Because XML documents can be structured to identify every important piece of information (as well as the relationships between the pieces), it's possible to write code that can process XML documents without human intervention. The fact that software vendors have spent massive amounts of time and money building XML development tools means writing that code is a relatively simple process.

* XML enables smart searches. Although search engines have improved steadily over the years, it's still quite common to get erroneous results from a search. If you're searching HTML pages for someone named "Amber," you might also find pages on amber colors, amber light, and other irrelevant matches. Searching XML documents for name elements (such as <firstName>) that contain the text Amber would give you a much better set of results.
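
The data-exchange and smart-search points above can be sketched together in a few lines of Python. Everything here is hypothetical and chosen for illustration: the sample records, the `people` and `person` element names, and the helper functions are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal records; the field names are assumptions.
RECORDS = [
    {"firstName": "Amber", "lastName": "Jones", "note": "Prefers email"},
    {"firstName": "John", "lastName": "Smith", "note": "Ordered an amber necklace"},
]


def records_to_xml(records):
    """Turn a list of dicts into an XML string (internal format -> XML)."""
    root = ET.Element("people")
    for record in records:
        person = ET.SubElement(root, "person")
        for field, value in record.items():
            ET.SubElement(person, field).text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_records(xml_text):
    """Turn the XML string back into a list of dicts (XML -> internal format)."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in person}
            for person in root.findall("person")]


def find_by_first_name(xml_text, first_name):
    """Smart search: match only the <firstName> element."""
    root = ET.fromstring(xml_text)
    return [p.findtext("lastName") for p in root.findall("person")
            if p.findtext("firstName") == first_name]


def naive_text_search(xml_text, word):
    """Plain search: match the word anywhere in a person's text."""
    root = ET.fromstring(xml_text)
    return [p.findtext("lastName") for p in root.findall("person")
            if any(word.lower() in (c.text or "").lower() for c in p)]


xml_text = records_to_xml(RECORDS)
assert xml_to_records(xml_text) == RECORDS  # the round trip loses nothing

print(find_by_first_name(xml_text, "Amber"))  # ['Jones']
print(naive_text_search(xml_text, "amber"))   # ['Jones', 'Smith']
```

The plain search also returns Smith because of the amber necklace in the note, while the element-aware search returns only the person whose first name is Amber.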

XML is now as important to the Web as HTML was to the Web's foundation. It has become the most common tool for transmitting data between all sorts of applications, and it is becoming more and more popular for storing and describing information.
