Reputation: 179
I am reading an HTML file from the internet, and when I read it, the output to my console is as follows:
<string>
<String1>
text
</String1>
<level2>
text2
</level2>
<level3>
text3
</level3>
<level4>
text4
</level4>
<level5>
TEXT
</level5>
</string>
<string>
<String2>
text
</String2>
<level2>
text2
</level2>
<level3>
text3
</level3>
<level4>
text4
</level4>
<level5>
THIS TEXT
</level5>
</string>
How can I access the level5 text in the second string? I have been trying all day with no luck and would really appreciate some input from someone who knows more about this.
Here is my code:
String line = null;
try {
// FileReader reads text files in the default encoding.
FileReader fileReader = new FileReader(String.valueOf(doc));
// Always wrap FileReader in BufferedReader.
BufferedReader bufferedReader = new BufferedReader(fileReader);
while ((line = bufferedReader.readLine()) != null) {
Elements tdElements = doc.getElementsByTag("level1");
for(Element element : tdElements )
{
//Print the value of the element
System.out.println(element.text());
}
}
// Always close files.
bufferedReader.close();
} catch (FileNotFoundException ex) {
System.out.println(
"Unable to open file '" +
doc + "'");
} catch (IOException ex) {
System.out.println(
"Error reading file '"
+ doc + "'");
// Or we could just do this:
// ex.printStackTrace();
}
}
//
catch (IOException e) {
e.printStackTrace();
}
Upvotes: 3
Views: 453
Reputation: 43013
You can use a CSS selector here:
string:nth-of-type(2) > level5
DEMO: http://try.jsoup.org/~8w_pfCxDhJwIseTKiKsQjQJOBRs
string:nth-of-type(2) /* Select the 2nd string node in document... */
> level5 /* ... then select all "level5" child nodes */
Document doc = ...
Element level5Node = doc.select("string:nth-of-type(2) > level5").first();
if (level5Node == null) {
throw new RuntimeException("Unable to locate level5 text...");
}
System.out.println(level5Node.text()); // THIS TEXT
Upvotes: 1
Reputation: 3079
Solution 1: your HTML is well-formed XML once it is wrapped in a single root element, so you can use XML tools.
You can get the level5 of the second string with XPath: "//string[2]/level5"
Solution 2: parse it with Jsoup to get the document, then use XPath as in Solution 1.
See Jsoup with XPath / XSoup: Does jsoup support xpath?
Solution 1:
// yourXml is a placeholder for the markup you read from the file.
// Wrap it in a single root element so the parser accepts it.
String xml = "<root>" + yourXml + "</root>";
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
XPath xPath = XPathFactory.newInstance().newXPath();
// Select the level5 child of the second string element.
String expression = "//string[2]/level5";
String value = xPath.evaluate(expression, document);
System.out.println("EVALUATE:" + value);
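For reference, here is a self-contained sketch of Solution 1 that only needs the JDK. The class name Level5XPathDemo, the helper secondLevel5, and the shortened sample markup are made up for illustration. It assumes the markup is well-formed XML. Note that Document here is org.w3c.dom.Document, not the Jsoup class of the same name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Level5XPathDemo {

    // Hypothetical helper: returns the trimmed text of <level5> inside the
    // second <string> element, or an empty string if there is no match.
    static String secondLevel5(String fragment) throws Exception {
        // The fragment has several top-level elements, so wrap it in one root.
        String xml = "<root>" + fragment + "</root>";
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // evaluate(String, Object) returns the string value of the first match.
        return xPath.evaluate("//string[2]/level5", document).trim();
    }

    public static void main(String[] args) throws Exception {
        // Shortened version of the markup from the question (an assumption).
        String fragment =
                "<string><String1>text</String1><level5>TEXT</level5></string>"
              + "<string><String2>text</String2><level5>THIS TEXT</level5></string>";
        System.out.println(secondLevel5(fragment));
    }
}
```

The trim() call is there because the text in the real file is surrounded by line breaks, which XPath would otherwise return as part of the value.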
Upvotes: 0
Reputation: 106
The code below uses Jsoup to parse the text you were referring to. The variable 'textToParse' holds the HTML that you provided above. You can use Jsoup's pseudo-selectors to find elements at a specific position in the DOM tree; ':eq(1)' matches an element whose zero-based sibling index is 1, which here is the second string element. Hope this is what you were looking for.
Document document = Jsoup.parse(textToParse);
Elements stringTags = document.select("string:eq(1)");
for (Element e : stringTags) {
System.out.println(e.select("level5").text());
}
//Output: THIS TEXT
Upvotes: 1