Reputation: 147
I have used Java regex very little. Is there a method (or tool) to convert an xsd:pattern restriction into a Java regex?
My xsd:pattern is as follows:
<xsd:simpleType name="myCodex">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="[A-Za-z]{6}[0-9]{2}[A-Za-z]{1}[0-9]{2}[A-Za-z]{1}[0-9A-Za-z]{3}[A-Za-z]{1}" />
    <xsd:pattern value="[A-Za-z]{6}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{3}[A-Za-z]{1}" />
    <xsd:pattern value="[0-9]{11,11}" />
  </xsd:restriction>
</xsd:simpleType>
Upvotes: 5
Views: 2033
Reputation: 23627
You can load the XSD into Java and extract the expressions. Then you can pass them to String.matches() or compile them into Pattern objects if you are going to reuse them a lot.
First, load the XSD file into a Java program (I called the file CodexSchema.xsd):
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document source = builder.parse(new File("CodexSchema.xsd"));
Then you can use XPath to find the patterns you want to extract (you might want to create a method that takes the name of the simple type, if you have many to process). I used a more complicated XPath expression to avoid registering the namespaces:
XPathFactory xPathfactory = XPathFactory.newInstance();
String typeName = "myCodex";
// this expression selects the <xs:pattern> elements of the named simple type
String xPathRoot = "//*[local-name()='simpleType'][@name='"+typeName+"']/*[local-name()='restriction']/*[local-name()='pattern']";
XPath patternsXPath = xPathfactory.newXPath();
Evaluating that expression returns an org.w3c.dom.NodeList containing the <xs:pattern> elements:
NodeList patternNodes = (NodeList)patternsXPath.evaluate(xPathRoot, source, XPathConstants.NODESET);
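For comparison, the namespace-registering alternative mentioned above might look like the following sketch. It assumes the schema uses the standard XML Schema namespace (http://www.w3.org/2001/XMLSchema) and switches the parser to namespace-aware mode so that prefixed XPath steps resolve. The class name NamespaceAwarePatterns, the method extractPatterns, and the shortened inline schema are made up for illustration; in practice you would parse CodexSchema.xsd instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceAwarePatterns {

    // Hypothetical, shortened stand-in for the contents of CodexSchema.xsd.
    static final String SCHEMA =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:simpleType name='myCodex'>"
        + "<xsd:restriction base='xsd:string'>"
        + "<xsd:pattern value='[A-Za-z]{6}[0-9]{2}'/>"
        + "<xsd:pattern value='[0-9]{11,11}'/>"
        + "</xsd:restriction></xsd:simpleType></xsd:schema>";

    // Returns the value attributes of the <pattern> facets of the named simple type.
    static List<String> extractPatterns(String schemaXml, String typeName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for prefixed XPath steps
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(schemaXml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the prefix "xs" for use in the XPath expression. It does not have to
        // equal the prefix used in the document (xsd); only the namespace URI counts.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "xs".equals(prefix)
                        ? XMLConstants.W3C_XML_SCHEMA_NS_URI
                        : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String namespaceURI) { return null; }
            @Override public Iterator<String> getPrefixes(String namespaceURI) { return null; }
        });

        String expr = "//xs:simpleType[@name='" + typeName + "']/xs:restriction/xs:pattern";
        NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(((Element) nodes.item(i)).getAttribute("value"));
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extractPatterns(SCHEMA, "myCodex"));
    }
}
```

The XPath is shorter and easier to read this way, at the cost of the small NamespaceContext implementation; the local-name() form avoids that boilerplate.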
Now you can loop through them and extract the contents of their value attribute. You might want to write a method for that:
public List<Pattern> getPatterns(NodeList patternNodes) {
    List<Pattern> expressions = new ArrayList<>();
    for (int i = 0; i < patternNodes.getLength(); i++) {
        Element patternNode = (Element) patternNodes.item(i);
        String regex = patternNode.getAttribute("value");
        expressions.add(Pattern.compile(regex));
    }
    return expressions;
}
You don't really need to compile them into Pattern objects. You could simply keep them as String values.
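As a rough illustration of that choice, the sketch below checks a value both ways using the third pattern from the question. The class and method names (PatternVersusString, viaString, viaPattern) are hypothetical. Both calls require the whole input to match; the difference is that String.matches() compiles the regex on every call, while the Pattern is compiled once and reused.

```java
import java.util.regex.Pattern;

class PatternVersusString {

    // Third pattern value from the question's myCodex type.
    static final String REGEX = "[0-9]{11,11}";

    // Compiled once, reusable for many inputs.
    static final Pattern COMPILED = Pattern.compile(REGEX);

    // Plain String variant: the regex is recompiled on each call.
    static boolean viaString(String value) {
        return value.matches(REGEX);
    }

    // Pattern variant: reuses the precompiled expression.
    static boolean viaPattern(String value) {
        return COMPILED.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(viaString("47385628403"));  // prints true
        System.out.println(viaPattern("47385628403")); // prints true
        System.out.println(viaPattern("4738562840"));  // prints false (only 10 digits)
    }
}
```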
You can now read all your patterns in Java using:
for (Pattern p : getPatterns(patternNodes)) {
    System.out.println(p);
}
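Here are some tests with the third pattern. Note that an XSD pattern has to match the whole value, so matches() reflects the schema's behaviour; find() also returns true when only part of the input matches: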
Pattern pattern3 = getPatterns(patternNodes).get(2);
Matcher matcher = pattern3.matcher("47385628403");
System.out.println("test1: " + matcher.find()); // prints `test1: true`
System.out.println("test2: " + "47385628403".matches(pattern3.toString())); // prints `test2: true`
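Several xsd:pattern facets inside one restriction are alternatives, so a value is valid for myCodex when any one of them matches it in full. A minimal sketch of that check follows. The class CodexValidator and the method isValid are hypothetical names, and the patterns are hard-coded from the question rather than read from the schema to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class CodexValidator {

    // The three pattern values from the myCodex type in the question.
    static final List<Pattern> PATTERNS = Arrays.asList(
        Pattern.compile("[A-Za-z]{6}[0-9]{2}[A-Za-z]{1}[0-9]{2}[A-Za-z]{1}[0-9A-Za-z]{3}[A-Za-z]{1}"),
        Pattern.compile("[A-Za-z]{6}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{3}[A-Za-z]{1}"),
        Pattern.compile("[0-9]{11,11}"));

    // True if at least one pattern matches the entire value.
    static boolean isValid(String value) {
        for (Pattern p : PATTERNS) {
            if (p.matcher(value).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isValid("47385628403"));      // prints true  (11 digits)
        System.out.println(isValid("473856284031"));     // prints false (12 digits; find() would accept it)
        System.out.println(isValid("ABCDEF12G34H567I")); // prints true  (first pattern)
    }
}
```

These particular patterns use only constructs that mean the same thing in both dialects. XSD-specific features such as the \i and \c escapes or character class subtraction have no direct Java equivalent and would need translating first.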
Upvotes: 2