Use local resources when validating XML
Depending on your enterprise security policy, some, if not most, of your middleware servers have no access to the Internet. It’s even worse when your whole development infrastructure is isolated from the Internet (as is common in banks or security companies). In this case, validating your XML against schemas becomes a real nightmare.
Of course, you could set the XML schema location to a location on your hard drive. But what about your co-workers then? They would need to have the schema in exactly the same filesystem hierarchy, and that still wouldn’t solve the problem in the production environment…
XML catalogs are the standard solution for your quandary. In effect, you plug in a resolver that knows how to map an online location to a location on your filesystem (or, more precisely, one location to another). The catalog file this resolver reads is designated by the xml.catalog.files system property, which may be different for each developer or in the production environment:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <system systemId="http://blog.frankel.ch/xmlcatalog/person.xsd" uri="person.xsd" />
    <system systemId="http://blog.frankel.ch/xmlcatalog/order.xsd" uri="order.xsd" />
</catalog>
Now, when validation happens, the parser looks up the local schema instead of the online schema URL.
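To see the mapping at work without any extra library, here is a minimal sketch that uses the javax.xml.catalog API shipped with the JDK since Java 9. Note that this is a different API from the CatalogResolver used in this article (among other things, it reads the javax.xml.catalog.files property rather than xml.catalog.files). The class name CatalogLookupSketch and the temporary directory are assumptions made for illustration only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the article's source code.
final class CatalogLookupSketch {

    // Same <system> mapping as the first entry of the catalog shown in the article.
    static final String CATALOG_XML =
          "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
        + "<system systemId=\"http://blog.frankel.ch/xmlcatalog/person.xsd\" uri=\"person.xsd\"/>"
        + "</catalog>";

    /** Writes the catalog to a temporary directory and resolves one system identifier through it. */
    static String resolveLocally(String systemId) throws IOException {
        Path dir = Files.createTempDirectory("xmlcatalog");
        Path catalogFile = dir.resolve("catalog.xml");
        Files.write(catalogFile, CATALOG_XML.getBytes(StandardCharsets.UTF_8));

        // The resolver is built from an explicit catalog URI instead of a system property.
        CatalogResolver resolver =
            CatalogManager.catalogResolver(CatalogFeatures.defaults(), catalogFile.toUri());

        // Relative "uri" attributes are resolved against the location of the catalog file.
        InputSource source = resolver.resolveEntity(null, systemId);
        return source.getSystemId();
    }
}
```

Calling CatalogLookupSketch.resolveLocally("http://blog.frankel.ch/xmlcatalog/person.xsd") should return a file: URI that ends in /person.xsd, located next to the temporary catalog file. The lookup itself does not check that the target file exists.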
XML catalogs can be used with JAXP, whether with SAX, DOM or StAX. The following code shows how to use them with SAX.
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
factory.setValidating(true);

XMLReader reader = factory.newSAXParser().getXMLReader();

// Set XSD as the schema language
reader.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");

// Use XML catalogs
reader.setEntityResolver(new CatalogResolver());

InputStream stream = getClass().getClassLoader().getResourceAsStream("person.xml");
reader.parse(new InputSource(stream));
Two lines are important:
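- The reader.setEntityResolver(new CatalogResolver()) call tells the reader to use XML catalogs
- But the factory.newSAXParser().getXMLReader() call is almost as important: we absolutely have to use the reader, not the SAX parser, to parse the XML, or validation will be done online! (The SAXParser.parse() methods take a DefaultHandler, which is itself an EntityResolver; in the JDK’s implementation, a non-null handler replaces the resolver set on the reader.)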
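One way to convince yourself that the reader really validates without going online is the following sketch. It follows the same steps as the listing above, but with two assumptions of mine: an in-memory resolver stands in for CatalogResolver, and an ErrorHandler is added so that validation errors are reported instead of being ignored. The schema, the urn:example:person namespace and the example.invalid URL are all made up for the purpose.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of the article's source code.
final class OfflineValidationSketch {

    // Made-up schema location: the ".invalid" domain never resolves, so a network lookup would fail.
    static final String SCHEMA_URL = "http://example.invalid/person.xsd";

    static final String SCHEMA =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:person' elementFormDefault='qualified'>"
        + "<xs:element name='person'><xs:complexType><xs:sequence>"
        + "<xs:element name='name' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Wraps a body in a person element whose schemaLocation points at SCHEMA_URL. */
    static String person(String body) {
        return "<person xmlns='urn:example:person'"
             + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
             + " xsi:schemaLocation='urn:example:person " + SCHEMA_URL + "'>"
             + body + "</person>";
    }

    static boolean isValid(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                           "http://www.w3.org/2001/XMLSchema");

        // Stand-in for CatalogResolver: serves the schema from memory for the known URL only.
        reader.setEntityResolver((publicId, systemId) -> {
            if (!SCHEMA_URL.equals(systemId)) {
                return null; // unknown identifiers fall back to the parser's default lookup
            }
            InputSource schema = new InputSource(new StringReader(SCHEMA));
            schema.setSystemId(systemId);
            return schema;
        });

        // Turn validation errors into exceptions; warnings are ignored.
        reader.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { }
            @Override public void error(SAXParseException e) throws SAXException { throw e; }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });

        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

With this sketch, isValid(person("<name>Jane</name>")) is expected to succeed, while a body the schema does not allow, such as "<age>42</age>", is expected to be rejected. If the resolver were not consulted, the first call would fail as well, since the schema could not be fetched.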
There are basically two use cases for XML catalogs:
- As seen above, the first use case is to validate XML files against cached local schema files
- In addition to that, XML catalogs can also be used as an alternative to LSResourceResolver (as seen previously in XML validation with imported/included schemas)
The CatalogResolver class is available in the JDK, but only in an internal com.sun package; the Apache XML resolver library does the job instead. In fact, the internal class is just a repackaging of the Apache one.
If you use Maven, the dependency is the following:
<dependency>
    <groupId>xml-resolver</groupId>
    <artifactId>xml-resolver</artifactId>
    <version>1.2</version>
</dependency>
Catalogs are a standard, file-system-based way to map an online schema location to a local one.
However, this isn’t always the best strategy. For example, Spring ships its schemas inside JARs, alongside its classes, and validation is done against those schemas.
In order to do the same, replace the reader.setEntityResolver(new CatalogResolver()) call in the code above: instead of a CatalogResolver, pass your own implementation of EntityResolver.
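As an illustration, such an implementation could look like the sketch below. It is not taken from Spring or from this article’s source: the ClasspathEntityResolver name, the map of system identifiers to classpath resource names and the choice to throw when a mapped resource is missing are all assumptions.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver that serves schemas packaged on the classpath (e.g. inside a JAR).
final class ClasspathEntityResolver implements EntityResolver {

    private final Map<String, String> resourcesBySystemId;
    private final ClassLoader classLoader;

    /**
     * @param resourcesBySystemId maps an online schema URL to a classpath resource name
     * @param classLoader         class loader used to open the resources
     */
    ClasspathEntityResolver(Map<String, String> resourcesBySystemId, ClassLoader classLoader) {
        this.resourcesBySystemId = resourcesBySystemId;
        this.classLoader = classLoader;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        String resource = resourcesBySystemId.get(systemId);
        if (resource == null) {
            // Not mapped: returning null lets the parser fall back to its default lookup,
            // which may try to open the URL itself.
            return null;
        }
        InputStream stream = classLoader.getResourceAsStream(resource);
        if (stream == null) {
            // Mapped but not packaged: fail loudly rather than silently go online.
            throw new FileNotFoundException("Classpath resource not found: " + resource);
        }
        InputSource source = new InputSource(stream);
        // Keep the original identifier so that error messages still mention the online URL.
        source.setSystemId(systemId);
        return source;
    }
}
```

It would then be plugged in with something like reader.setEntityResolver(new ClasspathEntityResolver(mappings, getClass().getClassLoader())), where mappings associates each online schema URL with the name of a resource bundled in the JAR.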
You can find the source for this article here (note that associated tests use a custom security policy file to prevent network access, and if you want to test directly inside your IDE, you should reuse the system properties used inside the POM).