Reputation: 3871
I know there are already plenty of regex questions, but I'm stuck on a requirement involving a URL. The request line looks like this:
POST /fr.synomia.search.ws.module.ModuleSearch/geResults/jsonp?xmlQuery=<?xml version='1.0' encoding='UTF-8'?><query ids="16914"><matchWord>avoir</matchWord><fullText><![CDATA[]]></fullText><quotedText><![CDATA[]]></quotedText><sensitivity></sensitivity><operator>AND</operator><offsetCooc>0</offsetCooc><cooc></cooc><collection>0</collection><searchOn>all</searchOn><nbResultDisplay>10</nbResultDisplay><nbResultatsParAspect>5</nbResultatsParAspect><nbCoocDisplay>8</nbCoocDisplay><offsetDisplay>0</offsetDisplay><sortBy>date</sortBy><dateAfter>0</dateAfter><dateBefore>0</dateBefore><ipClient>82.122.169.244</ipClient><typeQuery>0</typeQuery><equivToDelete></equivToDelete><allCooc>false</allCooc><versionDTD>3.0.5</versionDTD><r34>1tcbet30]</r34><mi>IND</mi></query>&callback=__gwt_jsonp__.P1.onSuccess&failureCallback=__gwt_jsonp__.P1.onFailure HTTP/1.1
It is a request sent to a REST web service. Inside the URL there is a tag: <query ids="16914">
I want to extract the number 16914 from the whole URL. The regex I tried is the following:
private static Pattern p = Pattern.compile(
"<\\?xml version='1.0' encoding='[^']+'\\?><query ids=\"([0-9]+)\"><matchWord>.*");
I tried some tools like Debuggex, but I can't figure out what the problem is. I would prefer a regex over chaining a lot of methods from the String class.
I would really appreciate any help. Thanks a lot in advance.
Upvotes: 0
Views: 188
Reputation: 2553
There is nothing wrong with your regex; it works for me when I search the input with Matcher.find():
String s = "POST /fr.synomia.search.ws.module.ModuleSearch/geResults/jsonp?xmlQuery=<?xml version='1.0' encoding='UTF-8'?><query ids=\"16914\"><matchWord>avoir</matchWord><fullText><![CDATA[]]></fullText><quotedText><![CDATA[]]></quotedText><sensitivity></sensitivity><operator>AND</operator><offsetCooc>0</offsetCooc><cooc></cooc><collection>0</collection><searchOn>all</searchOn><nbResultDisplay>10</nbResultDisplay><nbResultatsParAspect>5</nbResultatsParAspect><nbCoocDisplay>8</nbCoocDisplay><offsetDisplay>0</offsetDisplay><sortBy>date</sortBy><dateAfter>0</dateAfter><dateBefore>0</dateBefore><ipClient>82.122.169.244</ipClient><typeQuery>0</typeQuery><equivToDelete></equivToDelete><allCooc>false</allCooc><versionDTD>3.0.5</versionDTD><r34>1tcbet30]</r34><mi>IND</mi></query>&callback=__gwt_jsonp__.P1.onSuccess&failureCallback=__gwt_jsonp__.P1.onFailure HTTP/1.1";
Pattern p = Pattern.compile(
    "<\\?xml version='1.0' encoding='[^']+'\\?><query ids=\"([0-9]+)\"><matchWord>.*");
Matcher m = p.matcher(s);
// find() looks for the pattern anywhere in the input
if (m.find()) {
    System.out.println("Group: " + m.group(1));
}
Prints:
Group: 16914
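The question doesn't show how the Matcher was used, so this is only a guess: if matches() was called instead of find(), the pattern would fail, because matches() requires the whole input to match and the request line starts with "POST ...". Below is a minimal sketch of the difference. The class name QueryIdExtractor, the helper extractId, and the shortened sample request line are my own assumptions for illustration; the shorter pattern anchors only on the ids attribute.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QueryIdExtractor {
    // The pattern from the question, unchanged.
    static final Pattern ORIGINAL = Pattern.compile(
        "<\\?xml version='1.0' encoding='[^']+'\\?><query ids=\"([0-9]+)\"><matchWord>.*");

    // A shorter alternative (assumption): match only the attribute we need.
    static final Pattern QUERY_ID = Pattern.compile("<query ids=\"([0-9]+)\"");

    /** Returns the digits of the ids attribute, or null if the tag is absent. */
    public static String extractId(String requestLine) {
        Matcher m = QUERY_ID.matcher(requestLine);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened, hypothetical version of the request line from the question.
        String s = "POST /x/jsonp?xmlQuery=<?xml version='1.0' encoding='UTF-8'?>"
                 + "<query ids=\"16914\"><matchWord>avoir</matchWord></query>"
                 + "&callback=cb HTTP/1.1";

        System.out.println(extractId(s));                   // 16914
        System.out.println(ORIGINAL.matcher(s).find());     // true: found inside the line
        System.out.println(ORIGINAL.matcher(s).matches());  // false: line starts with "POST"
    }
}
```

If matches() is really needed, prefixing the original pattern with .* would let it cover the leading part of the request line as well.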
Upvotes: 1
Reputation: 34677
I'd use SAX for this purpose:
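import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XMLParser extends DefaultHandler {
    int id;

    // SAX passes (uri, localName, qName, attrs) in that order; qName is the third argument.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) throws SAXException {
        if (qName.equals("query")) {
            // The attribute in the request is "ids", not "id".
            id = Integer.parseInt(attrs.getValue("ids"));
        }
    }

    @Override
    public String toString() {
        return String.format("%d", this.id);
    }

    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        XMLParser parserObj = new XMLParser();
        // args[0] is the path to a file that contains only the XML document.
        parser.parse(new File(args[0]), parserObj);
        System.out.println(parserObj);
    }
}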
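The handler above expects a source that contains only XML, whereas the question has the XML embedded in a request line. Here is a rough sketch of how the two could be bridged, assuming the line is already URL-decoded as shown in the question and that the document always ends with </query>. The class RequestLineQueryId and the helpers extractXml and queryIds are hypothetical names of mine, not part of any library.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class RequestLineQueryId {

    /** Cuts the XML document out of the request line (hypothetical helper). */
    public static String extractXml(String requestLine) {
        int start = requestLine.indexOf("<?xml");
        int end = requestLine.indexOf("</query>");
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("no <query> document in request line");
        }
        return requestLine.substring(start, end + "</query>".length());
    }

    /** Parses the embedded XML with SAX and returns the ids attribute of <query>. */
    public static String queryIds(String requestLine) throws Exception {
        final String[] ids = new String[1];
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(extractXml(requestLine))),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    if ("query".equals(qName)) {
                        ids[0] = attrs.getValue("ids");
                    }
                }
            });
        return ids[0];
    }

    public static void main(String[] args) throws Exception {
        // Shortened, hypothetical version of the request line from the question.
        String line = "POST /x/jsonp?xmlQuery=<?xml version='1.0' encoding='UTF-8'?>"
                    + "<query ids=\"16914\"><matchWord>avoir</matchWord></query>"
                    + "&callback=cb HTTP/1.1";
        System.out.println(queryIds(line)); // 16914
    }
}
```

The value is kept as a String here because the attribute is named ids, which suggests it might carry more than one number; call Integer.parseInt on it if a single id is guaranteed.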
Upvotes: 1