Reputation: 981
Hi, I am working on a cloud computing project involving Amazon. The part of the code where I am stuck is getting a user's wish list from Amazon. Because of permission restrictions, I fetched the entire page source for the wish list URL instead. To extract the item IDs I compiled a pattern like this:
Pattern p = Pattern.compile("/dp/(\\w+)/");
Matcher matcher = p.matcher(content);
This was easy, and it now correctly lists all the products in that wish list with their item IDs. I also need the price of each one. According to the page source, the price appears as
<span class="a-size-base a-color-price a-text-bold">
$7.19
</span>
I need to write a pattern for this one, and I am confused and stuck. I am not good with regular expressions. Could anyone help, please? I saw online references for matching href attributes, but I don't think those will work for me.
Thanks to dkatzel I found the Jsoup library. I tried it in the online tool at Online Jsoup Try, and when I enter div as the CSS query I get the required output. But how do I code this in my Java program? I have the jsoup jar.
Upvotes: 0
Views: 525
Reputation: 8879
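An alternative answer that uses Jsoup. Parse the page source you already have (content in the question) and select the price span by its class; e.text() then returns the price text.
Document doc = Jsoup.parse(content);
Element e = doc.select("span.a-size-base").first();
Include jsoup-1.x.x.jar in your project or on the classpath when you compile, and add the following imports.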
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
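To collect the price of every item rather than only the first match, something along these lines might work. It is a minimal sketch, assuming the jsoup jar is on the classpath and that each price sits in a span carrying both the a-size-base and a-color-price classes, as in the snippet from the question. The class name WishListPrices and the method extractPrices are hypothetical names used for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of jsoup.
class WishListPrices {

    // Returns the trimmed text of every span that looks like a price element.
    // The selector is an assumption based on the markup shown in the question.
    static List<String> extractPrices(String html) {
        Document doc = Jsoup.parse(html);
        List<String> prices = new ArrayList<>();
        for (Element span : doc.select("span.a-size-base.a-color-price")) {
            prices.add(span.text()); // text() normalizes and trims whitespace
        }
        return prices;
    }

    public static void main(String[] args) {
        String html = "<span class=\"a-size-base a-color-price a-text-bold\">\n$7.19\n</span>";
        System.out.println(extractPrices(html));
    }
}
```

If the page markup changes, only the selector string should need adjusting.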
Upvotes: 3
Reputation: 71578
Wouldn't a simple expression work?
\\$\\d+(?:\\.\\d+)?
\\$ matches a literal $. \\d+ matches digits. (?:\\.\\d+)? matches an optional decimal part; the trailing ? is what makes it optional.
The whole match is what you're looking for, I guess. If you don't need the dollar symbol, you can use either a capture group and take the first group (i.e. \\$(\\d+(?:\\.\\d+)?)) or a lookbehind (i.e. (?<=\\$)\\d+(?:\\.\\d+)?).
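In a Java program, the capture-group variant could be used roughly as follows. This is a sketch that assumes the decimal part is meant to be optional and that the page source is already held in a string; the class name PriceRegexDemo and the method extractPrices are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the capture-group approach.
class PriceRegexDemo {

    // A literal $, then digits with an optional decimal part.
    // Group 1 holds the amount without the dollar sign.
    private static final Pattern PRICE = Pattern.compile("\\$(\\d+(?:\\.\\d+)?)");

    // Collects every amount found in the given text, in order of appearance.
    static List<String> extractPrices(String content) {
        List<String> prices = new ArrayList<>();
        Matcher matcher = PRICE.matcher(content);
        while (matcher.find()) {
            prices.add(matcher.group(1));
        }
        return prices;
    }

    public static void main(String[] args) {
        String content = "<span class=\"a-size-base a-color-price a-text-bold\">\n$7.19\n</span>";
        System.out.println(extractPrices(content)); // prints [7.19]
    }
}
```

Note that this matches any dollar amount anywhere in the page, not only the ones inside the price span, so an HTML parser is likely the more robust choice for real wish list pages.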
Upvotes: 1