Monday, February 14, 2011

Embedding XML within JAXB Objects

If a JAXB object contains a String element holding unparsed XML, it can be a little tricky to represent.

First, you need to stop JAXB from attempting to unmarshal the embedded XML. This is done by intercepting the call that unmarshals the element, which gives you access to the raw DOM node that JAXB has constructed. From here, it's simply a matter of converting the node into a String, which is written to a field. As you'd expect, for marshalling you need to do the opposite, turning the XML String into a DOM node. However, you must wrap your XML in a dummy root tag prior to returning it. For example:
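A minimal sketch of the two conversions, using only the JDK's DOM and transformer APIs. The class name `EmbeddedXml`, its method names, and the `wrapper` tag are hypothetical and chosen for illustration. In a JAXB class these helpers would presumably be called from an accessor pair that exposes a DOM `Element` (for instance a property mapped with `@XmlAnyElement`), while the String field itself is marked `@XmlTransient`; that wiring is an assumption and is not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class EmbeddedXml {
    // Hypothetical name for the dummy root tag.
    static final String WRAPPER = "wrapper";

    /** Marshalling direction: wrap the XML String in a dummy root and parse it into a DOM element. */
    static Element wrap(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            String wrapped = "<" + WRAPPER + ">" + xml + "</" + WRAPPER + ">";
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrapped)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse embedded XML", e);
        }
    }

    /** Unmarshalling direction: serialize everything inside the wrapper element back into a String. */
    static String innerXml(Element wrapper) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            // Write each child in turn so the wrapper tag itself is left out.
            for (Node child = wrapper.getFirstChild(); child != null; child = child.getNextSibling()) {
                transformer.transform(new DOMSource(child), new StreamResult(out));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize DOM node", e);
        }
    }
}
```

A setter receiving the DOM node from JAXB could store `EmbeddedXml.innerXml(node)` in the String field, and the matching getter could return `EmbeddedXml.wrap(content)`.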

The above code uses a utility class to wrap basic W3C XML parsing. It's not doing anything special, but, if you're curious, the source is available here.

Your code should now be able to marshal and unmarshal embedded XML to a String field. This works fine and dandy as long as your XML does not contain namespaces. Namespaces pose a couple of problems. First, if the namespaces aren't declared at the root of the document, JAXB will blow up while parsing. Second, if there is a declared but unused namespace, JAXB will not preserve it.

The first problem can be solved by scanning your XML prior to unmarshalling and attaching any namespace declarations that are discovered to the root node. This can be done in a variety of ways. I have a DOM4J-based solution available in my commons project here.
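One of those ways can be sketched with the JDK's built-in W3C DOM instead of DOM4J. This is not the commons-project code mentioned above; the class name `NamespaceHoister` is hypothetical. The sketch assumes the XML is well-formed on its own, copies only prefixed declarations (`xmlns:foo`), leaves default `xmlns` declarations alone, and keeps the first binding it sees if the same prefix is declared more than once.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NamespaceHoister {
    /** Parses the XML and copies every prefixed namespace declaration onto the root element. */
    static Document hoistNamespaces(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            // "*" matches every element in the document, in document order.
            NodeList elements = doc.getElementsByTagName("*");
            for (int i = 0; i < elements.getLength(); i++) {
                NamedNodeMap attributes = elements.item(i).getAttributes();
                for (int j = 0; j < attributes.getLength(); j++) {
                    Attr attribute = (Attr) attributes.item(j);
                    String name = attribute.getName();
                    // Only prefixed declarations; the first binding of a prefix wins.
                    if (name.startsWith("xmlns:") && !root.hasAttribute(name)) {
                        root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI,
                                name, attribute.getValue());
                    }
                }
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not hoist namespace declarations", e);
        }
    }
}
```

Documents that rebind one prefix to different URIs at different depths would need extra handling that this sketch leaves out.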

The second issue is really only a problem if something else in your application depends on having certain namespaces defined. Unfortunately, JAXB does not have a simple flag to toggle between behaviors, but there is a way to preserve unused namespaces. You have to create a SAXSource, set a couple of namespace-related flags on it, and feed it to your unmarshaller. It should look something like this:
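A rough sketch of building such a source with the JDK's SAX APIs. The two flags are assumed to be the standard SAX2 features `namespaces` and `namespace-prefixes`; the class name `NamespacePreservingSource` and the `reportedAttributes` helper are hypothetical, and the helper exists only to show that `xmlns` declarations are reported once the flags are set.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class NamespacePreservingSource {
    /** Builds a SAXSource whose reader reports xmlns declarations as ordinary attributes. */
    static SAXSource create(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            // Standard SAX2 feature names; assumed to be the flags the post refers to.
            reader.setFeature("http://xml.org/sax/features/namespaces", true);
            reader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);
            return new SAXSource(reader, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not create SAX source", e);
        }
    }

    /** Demo only: parses the source and lists the attribute names the reader reports. */
    static List<String> reportedAttributes(SAXSource source) {
        List<String> names = new ArrayList<>();
        try {
            XMLReader reader = source.getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    for (int i = 0; i < atts.getLength(); i++) {
                        names.add(atts.getQName(i));
                    }
                }
            });
            // Parsing consumes the underlying reader, so build a fresh source for real use.
            reader.parse(source.getInputSource());
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse SAX source", e);
        }
        return names;
    }
}
```

With a JAXB `Unmarshaller` in hand, a freshly created source would then be passed to `unmarshaller.unmarshal(source)`, which accepts any `javax.xml.transform.Source`.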

JAXB has more than its fair share of quirks, but at least you can force it to do just about anything.

2 comments:

  1. You rock!!!
    This worked perfectly for me.
    Interestingly, the first solution you posted grabbed the namespaces as well.
    I'm using hyperjaxb3 to transform a bunch of XML documents into a database. The trick with this was to include contentNode in the propOrder, and not the content element. The content element itself needs to be treated as XmlTransient, so JAXB doesn't complain.
  2. Probably the only resource out there on this issue. Saved the day for me!

    Thank you