Use local resources when validating XML

Depending on your enterprise security policy, some, if not most, of your middleware servers have no access to the Internet. It’s even worse when your development infrastructure is also isolated from the Internet (as in banks or security companies). In this case, validating your XML against schemas becomes a real nightmare.

Of course, you could set the XML schema location to a location on your hard drive. But what about your co-workers then? They would need the schema at exactly the same place in their filesystem hierarchy, and that wouldn’t solve your problem in the production environment…

XML catalogs

XML catalogs are the standard solution to this quandary. In effect, you plug in a resolver that knows how to map an online location to a location on your filesystem (or, more precisely, one location to another). The resolver finds its catalog files through the xml.catalog.files system property, which may differ for each developer and in the production environment. A catalog looks like this:

<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <system systemId="http://blog.frankel.ch/xmlcatalog/person.xsd" uri="person.xsd" />
    <system systemId="http://blog.frankel.ch/xmlcatalog/order.xsd" uri="order.xsd" />
</catalog>

Now, when validation happens, instead of looking up the online schema URL location, it looks up the local schema.
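To make the mapping concrete, here is a minimal sketch that loads a catalog like the one above and asks it where a system ID points. It assumes Java 9 or later and uses the JDK's built-in javax.xml.catalog API, which is a different implementation from the Apache resolver used in the rest of this article. The class name, the method names and the temporary directory are purely illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

// Hypothetical helper, for illustration only
public class CatalogLookupSketch {

    // Returns the local URI the catalog maps the system ID to, or null if there is no entry
    public static String lookup(Path catalogFile, String systemId) {
        Catalog catalog = CatalogManager.catalog(CatalogFeatures.defaults(), catalogFile.toUri());
        return catalog.matchSystem(systemId);
    }

    // Writes a one-entry catalog into a fresh temporary directory (assumption: any writable location would do)
    public static Path writeSampleCatalog() {
        String xml =
              "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
            + "    <system systemId=\"http://blog.frankel.ch/xmlcatalog/person.xsd\" uri=\"person.xsd\" />\n"
            + "</catalog>\n";
        try {
            Path catalogFile = Files.createTempDirectory("xmlcatalog").resolve("catalog.xml");
            Files.write(catalogFile, xml.getBytes(StandardCharsets.UTF_8));
            return catalogFile;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path catalogFile = writeSampleCatalog();
        // Prints a file: URI located next to the catalog file
        System.out.println(lookup(catalogFile, "http://blog.frankel.ch/xmlcatalog/person.xsd"));
        // No entry for this system ID, so the lookup yields null
        System.out.println(lookup(catalogFile, "http://blog.frankel.ch/xmlcatalog/unknown.xsd"));
    }
}
```

Note that the relative uri attribute is resolved against the location of the catalog file itself, which is why each developer can keep the schemas wherever they like as long as the catalog sits in the right place relative to them.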


XML catalogs can be used with any of the JAXP APIs: SAX, DOM or StAX. The following code shows how to use them with SAX.

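import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.apache.xml.resolver.tools.CatalogResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

SAXParserFactory factory = SAXParserFactory.newInstance();
// Schema validation requires a namespace-aware, validating parser
factory.setNamespaceAware(true);
factory.setValidating(true);
XMLReader reader = factory.newSAXParser().getXMLReader();

// Set XSD as the schema language
reader.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");

// Use XML catalogs
reader.setEntityResolver(new CatalogResolver()); // (1)
InputStream stream = getClass().getClassLoader().getResourceAsStream("person.xml");
reader.parse(new InputSource(stream)); // (2)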

Two lines are important:

1 we tell the reader to use XML catalogs
2 we absolutely have to use the reader, not the SAX parser to parse the XML or validation will be done online!

Use cases

There are basically two use cases for XML catalogs:

  1. As seen above, the first use case is to validate XML files against locally cached schema files
  2. In addition, XML catalogs can be used as an alternative to LSResourceResolver (as seen previously in XML validation with imported/included schemas)

Important note

Beware: the JDK does ship a CatalogResolver class, but only in an internal com.sun package that you should not depend on. Use the Apache XML resolver library instead; in fact, the internal class is just a repackaging of the Apache one.

If you use Maven, the dependency to add is the Apache XML Commons Resolver, published under the coordinates xml-resolver:xml-resolver.


Beyond catalogs

Catalogs are a standard, file-system-based way to map an online schema location to a local one.

However, this isn’t always the best strategy. For example, Spring ships its schemas inside its JARs, alongside the classes, and validation is done against those schemas.

To do the same, replace the line marked (1) in the code above: instead of a CatalogResolver, use your own implementation of EntityResolver.
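As an illustration, here is a minimal sketch of such a resolver. It is not Spring's implementation: the class name and the idea of passing the mapping as a Map from system ID to classpath resource name are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver that serves schemas packaged on the classpath
public class ClasspathEntityResolver implements EntityResolver {

    // Maps an online schema location to a resource name on the classpath
    private final Map<String, String> resourcesBySystemId;
    private final ClassLoader classLoader;

    public ClasspathEntityResolver(Map<String, String> resourcesBySystemId, ClassLoader classLoader) {
        this.resourcesBySystemId = resourcesBySystemId;
        this.classLoader = classLoader;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        String resource = resourcesBySystemId.get(systemId);
        if (resource == null) {
            // Unknown location: returning null lets the parser fall back to its default lookup
            return null;
        }
        InputStream stream = classLoader.getResourceAsStream(resource);
        if (stream == null) {
            throw new IOException("Schema not found on the classpath: " + resource);
        }
        InputSource source = new InputSource(stream);
        // Keep the original system ID so that relative references still resolve against it
        source.setSystemId(systemId);
        return source;
    }
}
```

It would be plugged in with something like reader.setEntityResolver(new ClasspathEntityResolver(Map.of("http://blog.frankel.ch/xmlcatalog/person.xsd", "person.xsd"), getClass().getClassLoader())). Keep in mind that returning null for an unmapped location means the parser may still try to reach the network; throwing an exception instead would be a stricter choice for a fully offline environment.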

You can find the source for this article here (note that associated tests use a custom security policy file to prevent network access, and if you want to test directly inside your IDE, you should reuse the system properties used inside the POM).

Nicolas Fränkel


