There is a lot of hype surrounding XML, and a lot of hype surrounding Java. Together these technologies propose to solve many of the most common (and persistent) general computing problems that have been around for the last 20 years. XML and Java are not revolutionary in the approach to solving these problems of interoperability of code and data across and within platform and application boundaries. Rather, XML and Java provide solutions to these problems by using the most successful strategies and techniques that have been honed and refined over the last 20 years of computing.
In the following paragraphs, I will highlight some of the most basic and important advantages that XML and Java provide to almost any system that uses them properly. This is by no means a comprehensive list of benefits, but items in this list should appear across just about any use of XML and Java technologies.
I will take a break from my normal pragmatic approach to getting you (the programmer) started with using XML and Java and just talk about the high level (design level) benefits of this wonderful combination. A good design is important to a good implementation for any system.
When you create your data using an XML editor (that you can write), you can not only input the content of your data, but also define the structural relationships that exist inside your data. Because XML lets you define your own tags and create the proper structural relationships in your information (with a DTD), you can use any validating XML parser to check the validity and integrity of the data stored in your XML documents. This makes it very easy to validate the structure and content of your information when you use XML. Without XML, you could also provide this validation feature, but at the expense of developing the code to do this yourself. XML is a great time saver because most of the features that are available in XML are used by most programmers when working on most projects.
By using XML and Java, you can quickly create and use information that is properly structured and valid. By using (or creating) DTDs and storing your information in XML documents, you have a cross-platform and language independent data validation mechanism (for free) in all your projects!
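As a rough illustration of this "free" validation, here is a minimal sketch that uses the JAXP parser shipped with the JDK. The address book DTD, the element names, and the class name `DtdValidationSketch` are assumptions made up for this example; the point is only that the parser, not your own code, checks the structure.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {

    // Hypothetical DTD: an address book holds one or more persons,
    // and every person must have a name followed by an email.
    static final String DTD =
          "<!DOCTYPE addressbook [\n"
        + "  <!ELEMENT addressbook (person+)>\n"
        + "  <!ELEMENT person (name, email)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "  <!ELEMENT email (#PCDATA)>\n"
        + "]>\n";

    /** Returns true only if the document is well formed and conforms to its DTD. */
    static boolean isValid(String xml) {
        final boolean[] ok = { true };
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // ask the parser to check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { ok[0] = false; } // validity errors arrive here
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            return false; // not well formed, or the parser could not be configured
        }
        return ok[0];
    }

    public static void main(String[] args) {
        String good = DTD + "<addressbook><person><name>Ada</name>"
                          + "<email>ada@example.com</email></person></addressbook>";
        String bad = DTD + "<addressbook><person><name>Ada</name></person></addressbook>";
        System.out.println("complete record valid? " + isValid(good));
        System.out.println("missing email valid?   " + isValid(bad));
    }
}
```

The same DTD could be handed to a parser written in any other language, which is what makes the validation mechanism cross-platform.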
You might use XML to define file formats to store information that is generated and used by your applications. This is another use of the structured nature of XML. The only limitation is that raw binary information can’t be embedded directly in the body of XML documents (it first has to be encoded as text, for example with Base64). For example, if you wrote a word processor in Java, you might choose to save your word processor documents to an XML (actually your ApplicationML) file. If you use a DTD, then your word processor would also get input file format validation as a feature for free. There are many other advantages to using XML as a file storage format for your applications, which will be illustrated later in the chapter.
- XML parsers make your application code more reliable and quick to develop by providing validity checking on your XML documents (if you use a DTD).
- XML allows you to easily generate XML documents (that contain your information), since it is so structured.
- XML parsers allow you to code faster by giving you a parser for all your XML documents (with and without DTDs).
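To make the word processor example a little more concrete, the following sketch generates a small document in a made-up ApplicationML using the DOM and transformer classes from the JDK. The element names `document` and `paragraph`, the `title` attribute, and the class name are assumptions for illustration only.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DocumentWriterSketch {

    /** Builds a DOM tree for a word processor document and serializes it to XML text. */
    static String toXml(String title, String[] paragraphs) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        // Hypothetical ApplicationML: <document title="..."> containing <paragraph> elements.
        Element root = doc.createElement("document");
        root.setAttribute("title", title);
        doc.appendChild(root);

        for (String text : paragraphs) {
            Element paragraph = doc.createElement("paragraph");
            paragraph.setTextContent(text); // special characters are escaped on output
            root.appendChild(paragraph);
        }

        // An identity transformer copies the DOM tree to a character stream.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml("Memo", new String[] { "Hello", "a < b" }));
    }
}
```

Because the tree is built through the DOM rather than by concatenating strings, the output is well formed by construction and can be read back by any XML parser.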
XML documents may be stored in files or databases. When stored in files, XML documents are simply plain text files with tags (and possibly DTDs). It is very easy to save your XML documents to a text file and pass the text file around to other machines, platforms and programs (as long as they can understand the data). In the worst case scenario, XML documents (files) can be viewed in a text editor on just about any platform.
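A minimal sketch of the file case, assuming a throwaway temporary file and a one-element `note` document (both invented for this example): the XML is written as ordinary UTF-8 text and then read back with the JDK's DOM parser.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class FileRoundTripSketch {

    /** Saves XML text to a plain text file, parses it back, and returns the root element's text. */
    static String saveAndReload(String xml) throws Exception {
        Path file = Files.createTempFile("note", ".xml");
        try {
            // The file is plain text, so any editor on any platform could open it.
            Files.write(file, xml.getBytes(StandardCharsets.UTF_8));

            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(file.toFile());
            return doc.getDocumentElement().getTextContent();
        } finally {
            Files.deleteIfExists(file); // clean up the temporary file
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(saveAndReload("<note>Call Bob</note>"));
    }
}
```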
XML documents are also naturally committed to a database (relational or object) or any other kind of XML document store. There are commercial products available which allow you to save XML documents to an XML storage layer (which is not a database per se), like Datachannel’s XStore and ODI’s eXcelon. These XML store solutions are quite expensive ($10,000 to $20,000 range).
XML documents are also quite naturally retrieved from a persistence layer (databases, file systems, XML stores). This makes XML well suited to real world applications where the information being used by different parts of a system is the most important thing.
Information in an XML document is stored in plain text. This might seem like a restriction if you were thinking of embedding binary information in an XML document. There are several advantages to keeping things plain text. First, it is easy to write parsers and all other XML enabling technology on different platforms. Second, it makes everything very interoperable by staying with the lowest common denominator approach. This is the whole reason the web is so successful despite all its flaws. By accepting and sending information in plain text format, programs running on disparate platforms can communicate with each other. This also makes it easy to integrate new programs on top of older ones (without rewriting the old programs), by simply making the interface between the new and old program use XML.
For example, if you have an address book document stored in an XML file, created on a Mac, that you would like to share with someone who has a PC, you can simply email them the plain text address book XML document. This can’t be done as easily with binary encoded information, which is usually platform (and program) dependent.
Another example is web enabling legacy systems. It is very feasible to create a Java web enablement application server that simply uses the services provided by the underlying legacy system. Instead of rewriting the legacy system, if the system can be made to communicate results and parameters through XML, the new and old system can work together without throwing away a company’s investment in the legacy system.
Because the W3C is the keeper of the XML standard, no single vendor should be able to cause interoperability problems between systems that use the open standard. This should be reassuring to most companies making an investment in this technology. By being vendor neutral, XML keeps even small companies out of reach of big companies that might choose to change the standards on them. For example, if a big company chooses to change a platform at its whim, then most other companies relying on that platform suffer. By keeping all data in XML and using XML in communications protocols, companies can maximize the lifetime of their investment in their products and solutions.
By being language independent, XML bypasses the requirement to have a standard binary encoding or storage format. Language independence also fosters immense interoperability amongst heterogeneous systems. It is also good for future compatibility. For example, if in the future a product needs to be changed in order to deal with a new computing paradigm or network protocol, then as long as XML keeps flowing through the system, adding a new layer to deal with this change is feasible.
By defining a set of programming language independent interfaces for accessing and mutating XML documents, the W3C made it easier for programmers to deal with XML. Not only does XML address the need for a standard information encoding and storage format, it also gives programmers a standard way to use that information. SAX (which was developed by the XML-DEV mailing list community rather than by the W3C) is a very low level API, but it is more than what was available before it. DOM is a higher level W3C API that even provides a default object model for all XML documents (saving you the time of creating one from scratch if the data you are using is document data).
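The difference in level between the two APIs can be sketched by solving the same small task twice, here counting elements with a given tag name. The class and method names are assumptions for this example; the SAX and DOM calls are the standard JAXP ones in the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxVersusDomSketch {

    /** SAX: the parser pushes events at a handler; nothing is kept in memory for you. */
    static int countWithSax(String xml, final String tag) throws Exception {
        final int[] count = { 0 };
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals(tag)) {
                    count[0]++; // called once for every opening tag the parser encounters
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    /** DOM: the parser builds a complete object model that can be queried afterwards. */
    static int countWithDom(String xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(tag).getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list><item/><item/><other/></list>";
        System.out.println("SAX count: " + countWithSax(xml, "item"));
        System.out.println("DOM count: " + countWithDom(xml, "item"));
    }
}
```

With SAX you write the bookkeeping yourself, which keeps memory use low for large documents; with DOM the tree is already there, at the cost of holding the whole document in memory.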
SAX, DOM and XML are very developer friendly because developers are going to decide whether this technology will be adopted by the majority and become a successful effort towards the goal of interoperable, platform, and device independent computing.
XML is derived from SGML, and so was HTML. So in essence, the current infrastructure available today to deal with HTML content can be re-used to work with XML. This is a very big advantage towards delivering XML content using the software and networking infrastructure already in place today. This should be a big plus in considering XML for use in any of your projects, because XML naturally lends itself to being used over the web.
Even if clients don’t support XML natively, it is not a big hindrance. In fact, Java with Servlets (on the server side) can convert XML with stylesheets to generate plain HTML that can be displayed in all web browsers.
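The conversion step might look roughly like the sketch below, which applies an XSLT stylesheet with the transformer API in the JDK. The stylesheet, the address book element names, and the class name are assumptions for illustration. The servlet itself is left out because the Servlet API is not part of the JDK; in a servlet, the resulting string would be written to the response.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToHtmlSketch {

    // Hypothetical stylesheet: renders every person's name as an HTML list item.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\"/>"
        + "<xsl:template match=\"/addressbook\">"
        + "<html><body><ul>"
        + "<xsl:for-each select=\"person\"><li><xsl:value-of select=\"name\"/></li></xsl:for-each>"
        + "</ul></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to an XML document and returns the generated HTML. */
    static String toHtml(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter html = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(html));
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<addressbook>"
                   + "<person><name>Ada</name></person>"
                   + "<person><name>Grace</name></person>"
                   + "</addressbook>";
        System.out.println(toHtml(xml));
    }
}
```

Keeping the presentation in the stylesheet means the same XML data could be rendered differently for different clients without touching the Java code.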
Using XML to pass parameters and return values on servers makes it very easy to allow these servers to be web-enabled. A thin server side Java layer might be added that interacts with web browsers using HTML and translates the requests and responses from the client into XML, that is then fed into the server.
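The translation layer described above can be sketched in two small methods: one that turns request parameters into an XML request, and one that pulls the return value out of an XML response. The `request`, `param`, `response`, and `result` element names are an invented protocol for this example, not a standard, and a real layer would receive the parameters from the web browser rather than from a hand-built map.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RequestTranslatorSketch {

    /** Encodes an operation name and its parameters as a (hypothetical) XML request. */
    static String toRequestXml(String operation, Map<String, String> params) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element request = doc.createElement("request");
        request.setAttribute("operation", operation);
        doc.appendChild(request);

        for (Map.Entry<String, String> entry : params.entrySet()) {
            Element param = doc.createElement("param");
            param.setAttribute("name", entry.getKey());
            param.setTextContent(entry.getValue());
            request.appendChild(param);
        }

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    /** Extracts the text of the first <result> element; assumes the response contains one. */
    static String readResult(String responseXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(responseXml)));
        return doc.getElementsByTagName("result").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("customerId", "42");
        System.out.println(toRequestXml("lookup", params));
        System.out.println(readResult("<response><result>Ada Lovelace</result></response>"));
    }
}
```

The server behind this layer only ever sees XML, so the same server could later be fronted by a different kind of client without being changed.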
By not predefining any tags in the XML Recommendation, the W3C allowed developers full control over customizing their data as they see fit. This makes XML very attractive to encoding data that already exists in legacy databases (by using database metadata, and other schema information). This extensibility of XML makes it such a great fit when trying to get different systems to work with each other.
Since the structure of an XML document can be specified in a DTD, DTDs provide a simple way to exchange XML documents that conform to an agreed structure. For example, if two software systems need to exchange information and both of them conform to one DTD, the two systems can process information from each other. DTDs are not as powerful as a full schema architecture for XML, however: they don’t support the typing, subclassing, or instantiation mechanisms that a schema architecture must have.
DTDs are a simple way to make sure that two or more XML documents are of the same “type”. It’s a very limited approach to making “typed” XML documents shareable across systems. In the future some kind of schema system will be proposed by the W3C that should allow typing, instantiation and inheritance of information (in XML).
All of the advantages of XML outlined so far make interoperability possible. This is one of the most important requirements for XML: enabling disparate systems to share information easily.
By taking the lowest common denominator approach, and by being web enabled, protocol independent, network independent, platform independent and extensible, XML makes it possible for new systems and old systems (that are all different) to communicate with each other. Encoding information in plain text with tags is better for this purpose than using proprietary and platform dependent binary formats.
XML provides solutions for problems that have existed for the past 20 years. With most applications and software services using the Internet as a target platform for deployment, XML could not have come at a better time. With the web becoming so popular, a new paradigm of computing has emerged for which XML supplies one of the most important pieces: platform, vendor and application neutral data. Regardless of the programming language used to process XML, it will enable this new networked computing world.
Java is also a key component of this new paradigm. On the server side, by working with XML, it can more naturally integrate legacy systems and services. With XML, Java can do what it does best, work very well on the server side, and web (and Internet) enable software systems.