Reputation: 3910
I have a bunch of URLs that share the following pattern:
http://www.ebay.com/itm/Crosman-Pumpmaster-760-Pump-177-Pellet-4-5-mm-BB-Air-Rifle-Black-760B-/251635693266?pt=LH_DefaultDomain_0&hash=item3a96a7f6d2
I want to extract item3a96a7f6d2. The http://www.ebay.com/itm/ and &hash= parts are fixed, while the string in between can change. I wrote:
String prodPatternString = "(http://www.ebay.com/itm/)(.*?)(hash=)(.*?)";
Pattern prodPattern = Pattern.compile(prodPatternString);
Matcher prodMatcher = prodPattern.matcher(prodUrl);
while (prodMatcher.find()) {
    String pid = matcher.group(4);
}
But it gives me an error saying "No match found". Any help will be greatly appreciated. Thanks.
Upvotes: 0
Views: 291
Reputation: 174696
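You need to change the line matcher.group(4); to prodMatcher.group(4); because find() is called on prodMatcher. Calling group() on a different matcher that has not matched anything is presumably what produces "No match found". Then remove the ? inside the last capturing group. Because .*? is non-greedy and nothing follows it in the pattern, it matches as few characters as possible, which here is the empty string, even though more characters are available.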
String prodUrl = "http://www.ebay.com/itm/Crosman-Pumpmaster-760-Pump-177-Pellet-4-5-mm-BB-Air-Rifle-Black-760B-/251635693266?pt=LH_DefaultDomain_0&hash=item3a96a7f6d2";
String prodPatternString = "(http://www.ebay.com/itm/)(.*?)(hash=)(.*)";
Pattern prodPattern = Pattern.compile(prodPatternString);
Matcher prodMatcher = prodPattern.matcher(prodUrl);
while (prodMatcher.find()) {
    String pid = prodMatcher.group(4);
    System.out.println(pid);
}
Output:
item3a96a7f6d2
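To see the difference between the lazy and greedy forms in isolation, here is a minimal sketch. The class name LazyVsGreedy, the helper lastGroup, and the shortened URL are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedy {
    // Hypothetical helper: returns the last capturing group of the first match, or null.
    static String lastGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(m.groupCount()) : null;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the eBay URL in the question.
        String url = "http://www.ebay.com/itm/x?pt=y&hash=item3a96a7f6d2";

        // Lazy quantifier at the end of the pattern: the match succeeds with zero characters.
        System.out.println("[" + lastGroup("(hash=)(.*?)", url) + "]");

        // Greedy quantifier: consumes everything up to the end of the input.
        System.out.println("[" + lastGroup("(hash=)(.*)", url) + "]");
    }
}
```

The first line printed should be an empty pair of brackets and the second should contain the item id.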
Upvotes: 1
Reputation: 784898
You can use this regex:
(http://www.ebay.com/itm/)(.*?)(hash=)([^&]*)
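.*? is matching too little in the 4th capturing group in your regex: with nothing after it in the pattern, the lazy quantifier is satisfied by the empty string. [^&]* is greedy and matches everything up to the next & or the end of the string.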
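A minimal sketch of how the [^&]* approach could be wrapped in a method. The class name HashExtractor and method name extractHash are illustrative only. This variant also escapes the dots in the host name and uses a single capturing group, which are additions beyond the regex shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashExtractor {
    // Dots are escaped so they match literally; [^&]* stops at the next '&'
    // or at the end of the input, whichever comes first.
    private static final Pattern HASH =
            Pattern.compile("http://www\\.ebay\\.com/itm/.*?[?&]hash=([^&]*)");

    // Returns the value of the hash parameter, or null if the URL does not match.
    static String extractHash(String url) {
        Matcher m = HASH.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String prodUrl = "http://www.ebay.com/itm/Crosman-Pumpmaster-760-Pump-177-Pellet-4-5-mm-BB-Air-Rifle-Black-760B-/251635693266?pt=LH_DefaultDomain_0&hash=item3a96a7f6d2";
        System.out.println(extractHash(prodUrl)); // prints item3a96a7f6d2
    }
}
```

Unlike a trailing .*, this keeps working if another parameter is ever appended after hash.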
Upvotes: 0
Reputation: 177
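You should check out the lastIndexOf method. You can take a substring of the URL that starts just after the last occurrence of "&hash=" (the index returned by lastIndexOf plus the length of "&hash=") and runs to the end of the string. This will give you the item value, provided hash is the last parameter in the URL.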
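A rough sketch of that idea without regular expressions. The class name HashSubstring and method name extractHash are hypothetical, and the extra check for a following & is an assumption added so that a trailing parameter does not end up in the result.

```java
class HashSubstring {
    // Returns the text after the last "&hash=", up to the next '&' or the end
    // of the string. Returns null if the marker is not present.
    static String extractHash(String url) {
        String marker = "&hash=";
        int start = url.lastIndexOf(marker);
        if (start < 0) {
            return null;
        }
        start += marker.length(); // skip past the marker itself
        int end = url.indexOf('&', start);
        return end < 0 ? url.substring(start) : url.substring(start, end);
    }

    public static void main(String[] args) {
        String prodUrl = "http://www.ebay.com/itm/Crosman-Pumpmaster-760-Pump-177-Pellet-4-5-mm-BB-Air-Rifle-Black-760B-/251635693266?pt=LH_DefaultDomain_0&hash=item3a96a7f6d2";
        System.out.println(extractHash(prodUrl)); // prints item3a96a7f6d2
    }
}
```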
Upvotes: 0