Reputation: 371
I have an HTTP response that I need to parse. More precisely, I want to extract part of the response based on a tag. Let's say:
<div class="row"><span>some text<pre>% Copyright (c) </pre></span></div>
So I'd pass "pre" and the parser would return the block between <pre> and </pre>.
Is there a better way to do this in Java? I can't tell whether HttpMessageParser could do it for me.
Thanks in advance!
Upvotes: 0
Views: 276
Reputation: 14228
Your input appears to be valid XML, so XPath is an easy and clean approach.
The XPath expression //pre/text() searches for pre elements and retrieves their text content. When evaluated with XPathConstants.STRING, it returns the string value of the first match.
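import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

String input = "<div class=\"row\"><span>some text<pre>% Copyright (c) </pre></span></div>";
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xpath = xPathFactory.newXPath();
try {
    // Compile the expression once, then evaluate it against the response body.
    XPathExpression expr = xpath.compile("//pre/text()");
    Object output = expr.evaluate(new InputSource(new StringReader(input)), XPathConstants.STRING);
    System.out.println(output.toString());
} catch (XPathExpressionException e) {
    // Thrown if the expression is invalid or the input cannot be parsed as XML.
    e.printStackTrace();
}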
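If the response may contain more than one pre element, the same idea can be extended to collect all of them. The following is only a sketch: the class name XPathTagExtractor and the method allTagContents are made up for illustration, and it assumes the input is well-formed XML and that the tag argument is a plain element name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class XPathTagExtractor {

    // Returns the text content of every <tag> element, in document order.
    // Assumes xml is well-formed and tag is a simple element name.
    static List<String> allTagContents(String xml, String tag) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//" + tag,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            // Invalid expression or input that is not parseable as XML.
            throw new IllegalArgumentException("Could not evaluate XPath for tag: " + tag, e);
        }
    }

    public static void main(String[] args) {
        String input = "<div class=\"row\"><span>some text<pre>% Copyright (c) </pre></span></div>";
        System.out.println(allTagContents(input, "pre"));
    }
}
```

Keep in mind that real-world HTML is often not well-formed XML, in which case the XML parser behind XPath will reject it.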
Upvotes: 0
Reputation: 83517
I don't think HttpMessageParser is the correct tool here, because it is intended for parsing HTTP messages regardless of whether they contain HTML. For simple parsing, you can use methods from the String class, such as substring() and indexOf(). For somewhat more complex parsing, you can use regular expressions. If you need something that actually recognizes HTML syntax, I suggest searching for an HTML parser library.
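To illustrate the regular-expression option, here is a minimal sketch. The class RegexTagExtractor and the method firstTagContent are names invented for this example, and it assumes the tag is not nested inside itself. A regex like this is fine for small, predictable snippets but is not a substitute for a real HTML parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class RegexTagExtractor {

    // Returns the text between the first <tag ...> and the next </tag>, if any.
    static Optional<String> firstTagContent(String html, String tag) {
        String name = Pattern.quote(tag);
        // (?is): case-insensitive, and '.' also matches line breaks.
        // [^>]* allows attributes on the opening tag; (.*?) is non-greedy.
        Pattern pattern = Pattern.compile(
                "(?is)<" + name + "\\b[^>]*>(.*?)</" + name + "\\s*>");
        Matcher matcher = pattern.matcher(html);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String response = "<div class=\"row\"><span>some text<pre>% Copyright (c) </pre></span></div>";
        System.out.println(firstTagContent(response, "pre").orElse("(no match)"));
    }
}
```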
Upvotes: 2
Reputation: 3570
Assuming there is only one pre tag in the response, you can use the substring method to get what you want.
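String response = "<div class=\"row\"><span>some text<pre>% Copyright (c) </pre></span></div>";
// Skip the full opening tag ("<pre>" is 5 characters) so the result starts after it.
String insidePre = response.substring(response.indexOf("<pre>") + "<pre>".length(), response.indexOf("</pre>"));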
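The same approach can be wrapped in a small method that takes the tag name and copes with a missing tag. This is a sketch only: SubstringTagExtractor and between are hypothetical names, and it assumes the opening tag has no attributes and the tag is not nested inside itself.

```java
// Hypothetical helper class, not part of any library.
class SubstringTagExtractor {

    // Returns the text between the first <tag> and the following </tag>,
    // or null if either delimiter is missing.
    static String between(String html, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = html.indexOf(open);
        if (start < 0) {
            return null;
        }
        start += open.length(); // move past the opening tag itself
        int end = html.indexOf(close, start);
        if (end < 0) {
            return null;
        }
        return html.substring(start, end);
    }

    public static void main(String[] args) {
        String response = "<div class=\"row\"><span>some text<pre>% Copyright (c) </pre></span></div>";
        System.out.println(between(response, "pre"));
    }
}
```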
Upvotes: 2