Monday, June 30, 2008

The Easiest Java XML Binding

I have an XML document and I want to use it to populate a corresponding set of Java objects. This is a common scenario when working with Java, so which method of Java XML binding is the easiest and requires the least code? You can't answer that question without first defining what "easy" means for bindings between Java and XML. To me "easy" is synonymous with "simple," so I treat easiness as the opposite of complexity. Complexity matters because, in my experience with software in a variety of languages and platforms, complex things take longer to implement and are generally harder to maintain. Less code written means less code to maintain and less time spent writing it.

So what are the methods for binding Java and XML?

  • JAXB - Requires Java Web Services Developer Pack v1.1 or later and the use of XSDs to bind the corresponding XML to Java interfaces (http://java.sun.com/developer/technicalArticles/WebServices/jaxb/).
  • JiBX - Open source project that uses binding definition documents to describe how XML is mapped to Java objects. At runtime, class files enhanced by the binding compiler build objects from an XML input document and can write those objects back out as XML documents (http://jibx.sourceforge.net/).
  • SAX/DOM/JAXP/Xerces-J - These are Java XML parsing APIs and parsers that require manual handling of XML documents, meaning that in general you have to manually map the values from an XML document to the corresponding Java object (http://www.cafeconleche.org/books/xmljava/chapters/ch05.html).
  • Spring Beans - Uses the Inversion of Control container from the Spring Framework to bind values from special "bean" XML documents to Java classes. This is not useful for reading a specific XML document, but it is the binding that is of interest: the automatic mapping of an XML value to a Java object field (http://jvalentino.blogspot.com/2007/10/introduction-to-spring-framework.html).
  • EJB Beans - Enterprise Java Beans are a subject unto themselves, far too broad for discussion here. What is important, as with the Inversion of Control container from the Spring Framework, is the ability to bind values from XML documents to Java object fields (http://java.sun.com/javaee/5/docs/tutorial/doc/bnblr.html).

I have used all of these methods in depth before, but none of them is exactly what I want:

  • I don't want to have to manually iterate through XML documents and manually map values.
  • I don't want to have to use XSD files.
  • I don't want to have to use binding compilers and binding definitions to map XML to Java classes.
  • I don't want to have to modify existing XML documents so that I can use them as beans.
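To make the first complaint concrete, here is a minimal sketch of what manual mapping looks like with the standard JAXP DOM API. The ManualDomMapping class, its Item holder, and the childText helper are hypothetical names made up for this illustration; only the javax.xml.parsers and org.w3c.dom calls are real. Every field needs its own line of hand-written mapping code, which is exactly the work I want to avoid.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ManualDomMapping {

    // Hypothetical holder for one <item>; not part of any library.
    static class Item {
        String title;
        String link;
    }

    // Returns the text of the first descendant element with the given name, or null.
    static String childText(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    // Walks every <item> element and copies its values field by field.
    static List<Item> parseItems(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList itemNodes = doc.getElementsByTagName("item");
            List<Item> items = new ArrayList<>();
            for (int i = 0; i < itemNodes.getLength(); i++) {
                Element element = (Element) itemNodes.item(i);
                Item item = new Item();
                item.title = childText(element, "title"); // one line per field...
                item.link = childText(element, "link");   // ...for every class
                items.add(item);
            }
            return items;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rss version=\"2.0\"><channel><title>Lift Off News</title>"
                + "<item><title>Star City</title><link>http://example.com/a</link></item>"
                + "</channel></rss>";
        for (Item item : parseItems(xml)) {
            System.out.println(item.title + " -> " + item.link);
        }
    }
}
```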

My usage of Adobe Flex (http://jvalentino.blogspot.com/2008/02/flex-tutorial-part-ii-language-concepts.html) over the last couple of years has spoiled my expectations for language and XML interaction; I want more with less.

I know exactly what I want: I want to have a series of Java objects that are representations of nodes within an XML document, pass the XML document to the Java object representing the root XML node, and have all of the corresponding Java objects populate themselves from that XML. I then want to be able to go from Java back to XML, and I don't want to have to do anything other than say "XML, go to Java" and then "Java, go to XML."

Why should there need to be complex mappings and external definitions when Java and XML are being used to represent the same thing?

Consider an RSS 2.0 document and what it represents:

<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Lift Off News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <language>en-us</language>
    <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
    <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <generator>Weblog Editor 2.0</generator>
    <managingEditor>editor@example.com</managingEditor>
    <webMaster>webmaster@example.com</webMaster>
    <ttl>5</ttl>
 
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <description>How do Americans get ready to work with Russians aboard the
        International Space Station? They take a crash course in culture, language
        and protocol at Russia's Star City.</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
    </item>
 
    <item>
      <title>Space Exploration</title>
      <link>http://liftoff.msfc.nasa.gov/</link>
      <description>Sky watchers in Europe, Asia, and parts of Alaska and Canada
        will experience a partial eclipse of the Sun on Saturday, May 31st.</description>
      <pubDate>Fri, 30 May 2003 11:06:42 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/30.html#item572</guid>
    </item>
 
    <item>
      <title>The Engine That Does More</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-VASIMR.asp</link>
      <description>Before man travels to Mars, NASA hopes to design new engines
        that will let us fly through the Solar System more quickly.  The proposed
        VASIMR engine would do that.</description>
      <pubDate>Tue, 27 May 2003 08:37:32 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/27.html#item571</guid>
    </item>
 
    <item>
      <title>Astronauts' Dirty Laundry</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-laundry.asp</link>
      <description>Compared to earlier spacecraft, the International Space
        Station has many luxuries, but laundry facilities are not one of them.
        Instead, astronauts have other options.</description>
      <pubDate>Tue, 20 May 2003 08:56:02 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/20.html#item570</guid>
    </item>
  </channel>
</rss>

Document from http://en.wikipedia.org/wiki/RSS_(file_format)

An "rss" node has a "channel" node, a "channel" node has "item" nodes, and every node has its own properties and attributes. These relationships can then be represented in terms of objects using a class diagram (ignoring methods for getters and setters):

From this class diagram it is then possible to generate the corresponding Java classes, assuming that the class fields have the same names as their corresponding XML nodes. This can't always work though since class fields can be represented in XML as properties or attributes, so it would be necessary to designate this in Java. You may also not want to have the Java class field names the same as the XML node names, so there would need to be a way to designate this as well. One easy way to designate specific things about classes, fields, and methods is to use annotations, so consider the following 3 annotations:

  1. @ClassXmlNodeName - used on a Java class to specify its corresponding node name in an XML document. This is required in order to designate a class as being bound to an XML document.
  2. @XmlAttributeName - used on a Java field to specify its corresponding attribute name in an XML document. This is required in order to specify that a particular class field should be represented in XML as an attribute instead of a property.
  3. @XmlNodeName - used on a Java field to specify its corresponding node name in an XML document. This is only required when a Java class field name is different than its corresponding XML node name.

@ClassXmlNodeName("rss")
public class Rss {
 private List<Channel> channels;
 @XmlAttributeName("version")
 private String version;

 //getters and setters go here...
}

@ClassXmlNodeName("channel")
public class Channel {
 private String title;
 private String link;
 private String description;
 private String language;
 private String pubDate;
 private String lastBuildDate;
 private String docs;
 private String generator;
 private String managingEditor;
 private String webMaster;
 private int ttl;
 private List<Item> items;

 //getters and setters go here...
}

@ClassXmlNodeName("item")
public class Item {
 private String title;
 private String link;
 private String description;
 private String pubDate;
 private String guid;

 //getters and setters go here...
}
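For the classes above to compile, the three annotations have to be declared somewhere. The following is a minimal sketch of how they could be declared, assuming each one carries a single String value and is retained at runtime so that Reflection can read it. The wrapper class AnnotationSketch and the nodeNameOf helper are hypothetical and exist only to keep the example in one file; this is not necessarily how the library declares them.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class AnnotationSketch {

    // Binds a class to an XML node name; RUNTIME retention makes it visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ClassXmlNodeName { String value(); }

    // Marks a field that is written as an XML attribute rather than a child node.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface XmlAttributeName { String value(); }

    // Overrides the node name when the field name differs from it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface XmlNodeName { String value(); }

    @ClassXmlNodeName("rss")
    static class Rss {
        @XmlAttributeName("version")
        private String version;
    }

    // Reads the node name a class is bound to, or null when the class is not annotated.
    static String nodeNameOf(Class<?> type) {
        ClassXmlNodeName annotation = type.getAnnotation(ClassXmlNodeName.class);
        return annotation == null ? null : annotation.value();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(nodeNameOf(Rss.class));
        System.out.println(Rss.class.getDeclaredField("version")
                .getAnnotation(XmlAttributeName.class).value());
    }
}
```

Without RetentionPolicy.RUNTIME the annotations would be discarded before the program runs, and a reflection-based binder would never see them.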

With the exception of the annotations, these Java classes are just like any other data transfer objects (DTOs) that would be used to represent an RSS XML document. These annotations act as the mappings between Java and XML when names and types do not provide enough information. Given this information, the class representing the XML document root node, and the XML document itself, it is possible to recursively map XML nodes to Java class fields and Java class fields back to XML nodes using Reflection. So using Reflection I wrote a library that can take any Plain Old Java Object (POJO) and an XML document, and use that XML document to populate that object and all of its child objects using one line of code. The same can then be done to convert a Java object back to its XML representation using a single line of code. All of this assumes that the Java classes correspond to the XML document.
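To illustrate the idea of recursive mapping with Reflection, here is a rough sketch of the XML-to-Java direction. It is not the library's source code: the ReflectionBindSketch class, its bind, children, and parse methods, and the trimmed-down Rss, Channel, and Item classes are all assumptions made for this example, and it only handles String fields, int fields, and lists of annotated classes.

```java
import java.io.ByteArrayInputStream;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ReflectionBindSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @interface ClassXmlNodeName { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @interface XmlAttributeName { String value(); }

    @ClassXmlNodeName("item")
    static class Item { String title; }

    @ClassXmlNodeName("channel")
    static class Channel { String title; int ttl; List<Item> items; }

    @ClassXmlNodeName("rss")
    static class Rss {
        @XmlAttributeName("version") String version;
        List<Channel> channels;
    }

    // Collects the direct child elements of parent that have the given tag name.
    static List<Element> children(Element parent, String name) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name)) result.add((Element) n);
        }
        return result;
    }

    // Recursively copies attribute and child-node values into the fields of target.
    static void bind(Object target, Element element) throws ReflectiveOperationException {
        for (Field field : target.getClass().getDeclaredFields()) {
            if (field.isSynthetic()) continue;
            field.setAccessible(true);
            XmlAttributeName attr = field.getAnnotation(XmlAttributeName.class);
            if (attr != null) {
                // Annotated fields come from attributes of the current node.
                if (element.hasAttribute(attr.value())) field.set(target, element.getAttribute(attr.value()));
            } else if (List.class.isAssignableFrom(field.getType())) {
                // List fields: bind one new object per matching child node.
                Class<?> itemType = (Class<?>) ((ParameterizedType) field.getGenericType())
                        .getActualTypeArguments()[0];
                String name = itemType.getAnnotation(ClassXmlNodeName.class).value();
                List<Object> list = new ArrayList<>();
                for (Element child : children(element, name)) {
                    Object item = itemType.getDeclaredConstructor().newInstance();
                    bind(item, child);
                    list.add(item);
                }
                field.set(target, list);
            } else {
                // Simple fields: the child node name is assumed to equal the field name.
                List<Element> matches = children(element, field.getName());
                if (matches.isEmpty()) continue;
                String text = matches.get(0).getTextContent().trim();
                field.set(target, field.getType() == int.class ? (Object) Integer.valueOf(text) : text);
            }
        }
    }

    // Parses the XML string and binds it to a new Rss instance.
    static Rss parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            Rss rss = new Rss();
            bind(rss, root);
            return rss;
        } catch (Exception e) {
            throw new IllegalStateException("Binding failed", e);
        }
    }

    public static void main(String[] args) {
        Rss rss = parse("<rss version=\"2.0\"><channel><title>Lift Off News</title><ttl>5</ttl>"
                + "<item><title>Star City</title></item></channel></rss>");
        System.out.println(rss.version + " / " + rss.channels.get(0).title
                + " / " + rss.channels.get(0).items.get(0).title);
    }
}
```

The sketch looks only at direct child nodes, so the "title" of a channel is never confused with the "title" of one of its items.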

Using this "XML Binder" library, the following code takes the given instance of an Rss object and populates it and its channels and items using the given XML document:

Rss rss = new Rss();
XmlBinderFactory.newInstance().bind(rss, new File("Rss2Test.xml"));

The following code can then be used to take that same Rss object, once changes have been made to it, and convert it back to XML:

String xml = XmlBinderFactory.newInstance().toXML(rss);
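The reverse direction can be sketched the same way. This is an illustration of the general technique and not the library's implementation: ReflectionToXmlSketch, its toXml and escape methods, and the cut-down Rss and Item classes are hypothetical, and the sketch assumes every object passed in carries a @ClassXmlNodeName annotation.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.List;

class ReflectionToXmlSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @interface ClassXmlNodeName { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @interface XmlAttributeName { String value(); }

    @ClassXmlNodeName("item")
    static class Item {
        String title;
        Item(String title) { this.title = title; }
    }

    @ClassXmlNodeName("rss")
    static class Rss {
        @XmlAttributeName("version") String version = "2.0";
        List<Item> items;
    }

    // Escapes the characters that are not allowed in XML text and attribute values.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Recursively writes an annotated object as an XML node; null fields are skipped.
    static String toXml(Object source) {
        try {
            String name = source.getClass().getAnnotation(ClassXmlNodeName.class).value();
            StringBuilder attributes = new StringBuilder();
            StringBuilder body = new StringBuilder();
            for (Field field : source.getClass().getDeclaredFields()) {
                if (field.isSynthetic()) continue;
                field.setAccessible(true);
                Object value = field.get(source);
                if (value == null) continue;
                XmlAttributeName attr = field.getAnnotation(XmlAttributeName.class);
                if (attr != null) {
                    attributes.append(' ').append(attr.value()).append("=\"")
                            .append(escape(value.toString())).append('"');
                } else if (value instanceof List) {
                    for (Object child : (List<?>) value) body.append(toXml(child));
                } else {
                    body.append('<').append(field.getName()).append('>')
                            .append(escape(value.toString()))
                            .append("</").append(field.getName()).append('>');
                }
            }
            return "<" + name + attributes + ">" + body + "</" + name + ">";
        } catch (IllegalAccessException e) {
            throw new IllegalStateException("Could not read field", e);
        }
    }

    public static void main(String[] args) {
        Rss rss = new Rss();
        rss.items = Arrays.asList(new Item("Star City"), new Item("Space Exploration"));
        System.out.println(toXml(rss));
    }
}
```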

I assert that this can work with most XML documents that have appropriate corresponding Java classes, but I have only tested it out so far with RSS 1.0, RSS 2.0, and a general test XML document in no particular format. The question now is what do I do with this library; anyone interested?

Since there seems to be some interest I have started a sourceforge.net project called "Really Easy Java XML Binding" or "RE:JAXB" since this is sort of a reply to JAXB. Here is the link

I have included POJOs for RSS 1.0 and RSS 2.0 along with unit tests to verify that the mappings from XML and to XML are working correctly. What I am looking for are ways to improve the efficiency of the to and from binding code, as well as POJOs for other common XML formats.

19 comments:

James said...

That's a really good article. I'd like to repost this on JavaLobby with your permission. If you are interested, contact me - james at dzone dot com - and we can organise it.

Thanks
James

Ayman said...

That's nice. I started with a similar project but did not have time to complete it. The basics do work, and as you have done, I use reflection and annotation to map Java Objects to XML nodes.
The only main difference is that I'm starting out with Java Objects and want to have good XML out of them, not the other way around. You can have a look at the project (Xerialize) page on Google Code.

fargo said...

Nice article. But the question remains unanswered, which XML Binder library have you used?

Laurent said...

It looks a lot like Xstream. You should have a look. Anyway, it's a good article, thanks for sharing your experience

Wayne said...

Hi,

i think you might like to have a look at: http://xfire.codehaus.org/Aegis+Binding

regards
Wayne

John Valentino said...

"The only main difference is that I'm starting out with Java Objects and want to have good XML out of them, not the other way around. You can have a look at the project (Xerialize) page on Google Code."
-ayman

The XMLBinderFactory class that I wrote can do this as well. Your project looks similar to what I did. I knew that this had to have been done before, I just couldn't find anything.

"Nice article. But the question remains unanswered, which XML Binder library have you used?"
-fargo

I made my own; the XMLBinderFactory class. I plan on turning it into a sourceforge project soon, at least to be able to more easily distribute the source code.

"It looks a lot like Xstream. You should have a look. Anyway, it's a good article, thanks for sharing your experience"
- laurent

This is very similar to what I have done except my method requires fewer lines of code, but Xstream probably can do a bit more. This is exactly what I was looking for in the beginning; I will check it out some more.

"i think you might like to have a look at: http://xfire.codehaus.org/Aegis+Binding"
- wayne

This is another XML-POJO binding, but it looks like XSDs are involved, which I don't want to have to use.

John Valentino said...

Since there seems to be some interest I have started a sourceforge.net project called "Really Easy Java XML Binding" or "RE:JAXB" since this is sort of a reply to JAXB. Here is the link

I have included POJOs for RSS 1.0 and RSS 2.0 along with unit tests to verify that the mappings from XML and to XML are working correctly. What I am looking for are ways to improve the efficiency of the to and from binding code, as well as POJOs for other common XML formats.

Zse said...

Have you tried out Castor?

http://www.castor.org/xml-mapping.html

It's been around forever, is stable, and is rather easy to use.

John Valentino said...

"Have you tried out Castor? http://www.castor.org/xml-mapping.html It's been around forever, is stable, and is rather easy to use."
- zse

I haven't looked at it before, but it looks like a good XML-POJO binding. I don't want to have to use external binding definitions though like with JiBX.

Mikhail Koryak said...

I hope you blog more frequently than you have been doing on average since 2005 =) I'll be checking this blog for more new betterer stuff

yesilay&danijay said...

You can try to use rhino script in java 6 that implements E4X like Flex.

Ivan said...

I think xml actually works better if you model it as a ResultSet and then iterate over nodes selected via Xpath and then use those to map.

The thing about placing Annotations on field names is that it fixes one type of xml format to your pojo. What about namespaces? Other twists on simple xml formats like atom?

Instead consider the following pseudocode...


XmlDataSource xds = new XmlDataSource();
xds.namespace("custom", CUSTOM_URI);

XmlResultSet xrs = XmlClass.xpath("/rss/model/item, /rss/channel/custom:item");
List<Item> items = new ArrayList<Item>();
while (xrs.next()) {
    Item item = new Item();
    item.setTitle(xrs.getString("title"));
    item.setDescription(xrs.getString("description"));
    items.add(item);
}

// do something with the items...

With a little auto-naming magic present in many of today's frameworks, the actual loop could be done for you,
and you get an easy to understand, non-invasive way to get data out of XML.

Either way, great article. There are def a lot of people out there who find current mapping methods unacceptable.

John Valentino said...

This works with namespaces as well. For example you could use an annotation to map dc:author to a class field called dcAuthor.

Yes, I am looking to fix an XML format to a POJO for my particular use. This may not always be the case though as you have pointed out, where a ResultSet may be what you need instead.

Sébastien said...

I used XStream recently and if you write a little wrapper class, it's really a one liner to go from objects to xml and back. The great thing is that you don't even need to touch your classes.

Tony said...

Let me be the fourth person to say XStream. Why reinvent the wheel, unless you are just doing it for kicks.

Davie Lang said...

Good article.

Along the same lines, I found SimpleXML excellent.

http://simple.sourceforge.net/home.php

Felipe Coury said...

Your library is the best thing since sliced bread. Awesome! Thank you!

Kalpesh Soni said...

I have used castor before
it's awesome

I even use their marshal api to debug my java object values

:D

ygor said...

A big thanks to all for pointing to XStream. This is just what I have been looking for.