Reputation: 1
I am trying to find an element in a string using Pattern and Matcher in Java.
I have a variant-items node and need to get all the characters between its opening and closing tags. I tried the regex below, but it skips the element altogether. However, if I search with the same regex in Notepad++, I get the desired result selected. Please advise.
<variant-items>((.|\n)*)</variant-items>
Below is my implementation:
String patternSourceComponent = "<variant-items>((.|\n)*)</variant-items>";
String result=this.isMatched(patternSourceComponent, xml);
public String isMatched(String patternSourceComponent, String xml)
{
    String varientItem = "";
    try {
        Pattern patternComponent = Pattern.compile(patternSourceComponent);
        Matcher matcherComponent = patternComponent.matcher(xml);
        System.out.println("matcherComponent Find : " + matcherComponent.find());
        while (matcherComponent.find()) {
            varientItem = matcherComponent.group(0).trim();
            System.out.println("varientItem : " + varientItem);
        }
    }
    catch (Exception e)
    {
        System.out.println("Exception : " + e);
    }
    return varientItem;
}
Upvotes: 0
Views: 837
Reputation: 12992
I would personally use the Java DOM API to check your nodes. Using regex for XML is a nightmare, and any code attempting it is very likely to break in the future. Try something like this to get the string contents of your 'variant-items' nodes.
File xmlFile = new File("your_xml.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
NodeList nList = doc.getElementsByTagName("variant-items");
for (int i = 0; i < nList.getLength(); i++) {
    Node node = nList.item(i);
    // getNodeValue() is null for element nodes; getTextContent() returns the text inside
    System.out.println(node.getTextContent());
}
The above code prints the text content of every 'variant-items' node in an XML file.
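Since the question holds the XML in a String rather than a file, here is a minimal self-contained sketch of the same DOM approach that parses from a string. The class name VariantItemsDom, the method name variantItems, and the sample XML are made up for illustration, and it assumes the input is well-formed XML.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original post.
class VariantItemsDom {

    // Returns the trimmed text content of every <variant-items> element.
    public static List<String> variantItems(String xml) {
        try {
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            // Parse from the in-memory string instead of a File.
            Document doc = dBuilder.parse(new InputSource(new StringReader(xml)));

            NodeList nList = doc.getElementsByTagName("variant-items");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nList.getLength(); i++) {
                // getTextContent() concatenates the text of all descendants.
                result.add(nList.item(i).getTextContent().trim());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Assumption: malformed input is reported as an unchecked exception.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input.
        String xml = "<root><variant-items>red</variant-items>"
                   + "<variant-items>blue</variant-items></root>";
        System.out.println(variantItems(xml)); // prints [red, blue]
    }
}
```

Note that getTextContent() returns only the text, not the markup of any child elements; if the child tags themselves are needed, the nodes would have to be serialized instead.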
If memory or speed is a concern (for example, when your_xml.xml is huge), you might be better off using SAX, which is faster, though it takes a little more code, and does not hold the whole document in memory.
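A rough sketch of what the SAX version could look like is below. The class name VariantItemsSax and the method name variantItems are hypothetical, and the sketch assumes that 'variant-items' elements are never nested inside one another. It reads from a string to stay self-contained; for a large file, SAXParser.parse also accepts a File or an InputStream.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original post.
class VariantItemsSax {

    // Collects the trimmed text found inside each <variant-items> element.
    public static List<String> variantItems(String xml) {
        final List<String> result = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private boolean inside = false; // true while within <variant-items>
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals("variant-items")) {
                    inside = true;
                    text.setLength(0); // start a fresh buffer for this element
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so append each chunk.
                if (inside) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("variant-items")) {
                    inside = false;
                    result.add(text.toString().trim());
                }
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Assumption: malformed input is reported as an unchecked exception.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return result;
    }
}
```

The handler only buffers the text of the element currently being read, which is why this approach keeps memory use low on large inputs.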
Upvotes: 1