Reputation: 361
I am looking to extract part of a string using XPath.
Full string -
Informational (nonfiction), 1,303 words, Level S (Grade 3)
HTML code:
<div class="bookInfo">
Informational (nonfiction),
1,303 words,
Level S (Grade 3)
</div>
I am looking to extract just the number of words from these strings, i.e. "1,303 words" in this case.
The XPath of this string looks like this:
//*[@id="contentarea-inner"]/div[3]/div[2]/div
Webpage in question - https://www.readinga-z.com/books/leveled-books/book/?id=820
Please advise on how I can modify the XPath to extract only the number of words from the page. I have several thousand pages to get this info from.
Thanks
Upvotes: 2
Views: 826
Reputation: 17563
You can achieve the same result with the split function in Java.
Use this code:
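String text = driver.findElement(By.xpath("//*[@id='contentarea-inner']/div[3]/div[2]/div")).getText();
// Split on a comma followed by whitespace so the thousands separator in "1,303" is not split.
String[] fields = text.split(",\\s+");
// The word count is the second field, e.g. "1,303 words".
String count = fields[1].trim();
System.out.println(count);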
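The split idea can also be tried without a browser. The following is a minimal sketch, not a definitive implementation: the class `WordCountSplit` and the helper `extractWordCount` are hypothetical names, and it assumes the fields are separated by a comma followed by whitespace and that exactly one field ends in "words". In a Selenium script the input string would be whatever getText() returns for the div.

```java
class WordCountSplit {
    // Hypothetical helper: returns the field that ends in "words",
    // e.g. "1,303 words", or null if no such field exists.
    static String extractWordCount(String bookInfo) {
        // Split only on a comma followed by whitespace, so the
        // thousands separator inside "1,303" is left intact.
        for (String field : bookInfo.trim().split(",\\s+")) {
            String trimmed = field.trim();
            if (trimmed.endsWith("words")) {
                return trimmed;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "Informational (nonfiction), 1,303 words, Level S (Grade 3)";
        System.out.println(extractWordCount(text));
    }
}
```

Looking for the field that ends in "words" rather than taking a fixed index means the result does not depend on how many fields come before the word count.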
Please get back to me if you are still facing any issue :)
Upvotes: 1
Reputation: 6277
Basically you need both XPath and a regex:
\s[,\d]+(?= words)
See the regex's work on the text node.
Upvotes: 1
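For reference, the regex above can be applied in Java roughly as follows. This is only a sketch under stated assumptions: the class `WordCountRegex` and the helpers `extractWordCount` and `toInt` are hypothetical names, and the input is passed in as a plain string rather than read from the page (with Selenium it would come from getText() on the element located by the XPath). Because the pattern starts with \s, the match includes one leading whitespace character, which is trimmed here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCountRegex {
    // One whitespace character, then digits and commas; the lookahead
    // requires " words" to follow but keeps it out of the match.
    private static final Pattern WORD_COUNT =
            Pattern.compile("\\s[,\\d]+(?= words)");

    // Hypothetical helper: returns the count as written, e.g. "1,303",
    // or null if the text contains no word count.
    static String extractWordCount(String bookInfo) {
        Matcher m = WORD_COUNT.matcher(bookInfo);
        return m.find() ? m.group().trim() : null;
    }

    // Hypothetical helper: drops the thousands separators and parses the number.
    static int toInt(String count) {
        return Integer.parseInt(count.replace(",", ""));
    }

    public static void main(String[] args) {
        String text = "Informational (nonfiction), 1,303 words, Level S (Grade 3)";
        String count = extractWordCount(text);
        System.out.println(count + " -> " + toInt(count));
    }
}
```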