Home > Java > Use local resources when validating XML

Use local resources when validating XML

Depending of you enterprise security policy, some – if not most of your middleware servers have no access to Internet. It’s even worse when your development infrastructure is isolated from the Internet (such as banks or security companies). In this case, validating your XML against schemas becomes a real nightmare.

Of course, you could set the XML schema location to a location on your hard drive. But about about your co-workers then? They would have to have the schema in exactly the same filesystem hierarchy, and that wouldn’t solve your problem about the production environment…

XML catalogs

XML catalogs are the nominal solution for your quandary. In effect, you plug in a resolver that knows how to map an online location to a location on your filesystem (or more precisely a location to another one). This resolver is set through the xml.catalog.files system property that may be different for each developer or in the production environment:



    
    

Now, when validation happens, instead of looking up the online schema URL location, it looks up the local schema.

How-to

XML catalogs can be used with JAXP, either SAX, DOM or STaX. The following details the code how to use them with SAX.

SAXParserFactory factory = SAXParserFactory.newInstance();

factory.setNamespaceAware(true);
factory.setValidating(true);

XMLReader reader = factory.newSAXParser().getXMLReader();

// Set XSD as the schema language
reader.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");

// Use XML catalogs
reader.setEntityResolver(new CatalogResolver());

InputStream stream = getClass().getClassLoader().getResourceAsStream("person.xml");

reader.parse(new InputSource(stream));

Two lines are important:

  • In line 13, we tell the reader to use XML catalogs
  • But line 7 is almost as important: we absolutely have to use the reader, not the SAX parser to parse the XML or validation will be done online!

Use cases

There are basically two use cases for XML catalogs:

  1. As seen above, the first use-case is to validate XML files against cached local schemas files
  2. In addition to that, XML catalogs can also be used as an alternative to LSResourceResolver (as seen previously in XML validation with imported/included schemas)

Important note

Beware: the CatalogResolver class is available in the JDK in an internal com.sun package, but the Apache XML resolver library does the job. In fact, the internal class is just a repackaging of the Apache one.

If you use Maven, the dependency is the following:


    xml-resolver
    xml-resolver
    1.2

Beyond catalogs

Catalog are a file-system based standard way to map an online schema location.

However, this isn’t always the best strategy to choose from. For example, Spring provides its different schemas inside JARs, along classes and validation is done against those schemas.

In order to do the same, replace line 13 in the code above, and instead of a CatalogResolver, use your own implementation of EntityResolver.

You can find the source for this article here (note that associated tests use a custom security policy file to prevent network access, and if you want to test directly inside your IDE, you should reuse the system properties used inside the POM).

email
Send to Kindle
Categories: Java Tags:
  1. Wim
    October 23rd, 2012 at 11:19 | #1

    This CatalogResolver will only work if there’s a xsi:schemaLocation or xsi:noSchemaLocation declared in the root element (such that it can map from systemId or publicId to the local version).

    I was wondering if there’s a way to validate xml documents against a locally mapped xsd, when the xml document only declares a xmlns attribute? I tested your code, and it fails as well when the xsi:schemaLocation attributes are removed from person.xml and order.xml (I also adapted catalog.xml to use ‘uri’ mappings from xmlns to local xsd, but that didn’t help much).

  1. No trackbacks yet.