Reputation: 1191
I want to extract an integer from a string in Java. The string is:
<a target="_blank" href="http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669">Pirlo A.</a>
I want to get the value "669", which sits between the last _ and the following ". I know it is possible to use StringTokenizer, but the code I wrote is not very good. Is there a simpler way to do this?
Upvotes: 0
Views: 81
Reputation: 1277
Or you can use the String split method (it returns the split string as an array, using the regex passed as the parameter): split on "> first, then split the first piece on _ and take array[array.length-1] :)
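The two splits described above could be sketched as follows. This is only a minimal illustration, assuming the input always contains "> after the number; the class name SplitExample and the helper extractId are made up for this example.

```java
class SplitExample {
    // Hypothetical helper: returns the number between the last _ and the closing ">.
    static int extractId(String html) {
        // First split: keep everything before the "> that ends the opening tag.
        String beforeTagEnd = html.split("\">")[0];
        // Second split: break on underscores and take the last piece.
        String[] array = beforeTagEnd.split("_");
        return Integer.parseInt(array[array.length - 1]);
    }

    public static void main(String[] args) {
        String s = "<a target=\"_blank\" href=\"http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669\">Pirlo A.</a>";
        System.out.println(extractId(s)); // prints 669
    }
}
```

Integer.parseInt throws NumberFormatException if the last piece is not a number, so real input may need a guard around it.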
Upvotes: 1
Reputation: 17595
Here is a solution using Jsoup:
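import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public void extractNumber()
{
    String s = "<a target=\"_blank\" href=\"http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669\">Pirlo A.</a>";
    Document document = Jsoup.parse(s);
    System.out.println(document);
    Elements elementsByTag = document.getElementsByTag("a");
    // attr() on Elements returns the value from the first matched element that has the attribute
    String attr = elementsByTag.attr("href");
    System.out.println(attr);
    // The number is everything after the last underscore in the URL
    String sNumber = attr.substring(attr.lastIndexOf('_')+1);
    System.out.println(Integer.parseInt(sNumber));
}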
Note that elementsByTag is a collection; you may want to iterate over it and do this for the href attribute of every <a /> element.
Upvotes: 1
Reputation: 25873
You can solve it with regular expressions, using the Pattern and Matcher classes. Here's an example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final Pattern PATTERN = Pattern.compile(".*_(\\d{3})\".*");

public static void main(String[] args) {
    final String input = "<a target=\"_blank\" href=\"http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669\">Pirlo A.</a>";
    final Matcher m = PATTERN.matcher(input);
    // matches() succeeds only if the whole input fits the pattern
    if (m.matches()) {
        System.out.println(m.group(1));
    } else {
        System.out.println("No match");
    }
}
The regular expression is .*_(\\d{3})\".* (written here as a Java string literal, hence the extra backslashes):
.* -> Any number (including 0) of any characters
_ -> Character _
(\\d{3}) -> 3 digits. The parentheses tell the regex engine to keep this match as a group, which you can refer to later as group 1 (m.group(1))
\" -> Double quote character
.* -> Any number (including 0) of any characters
Upvotes: 4
Reputation: 2070
This code works:
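import java.util.regex.Matcher;
import java.util.regex.Pattern;

// input holds the string to search
Pattern p = Pattern.compile(".*_(\\d+)\".*");
Matcher m = p.matcher(input);
if (m.matches()) {
    System.out.println(m.group(1)); // one or more digits between _ and "
}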
To understand it, have a look at regular expressions, e.g. http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
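The \\d+ quantifier matters when the number does not have exactly three digits. The sketch below compares it with a fixed-width \\d{3} on a made-up four-digit input; the class name DigitCountDemo, the helper extract, and the example.com URL are hypothetical and used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitCountDemo {
    // Fixed width: exactly three digits between _ and "
    static final Pattern THREE = Pattern.compile(".*_(\\d{3})\".*");
    // Variable width: one or more digits between _ and "
    static final Pattern MANY = Pattern.compile(".*_(\\d+)\".*");

    // Hypothetical helper: the captured digits, or null when the whole input does not match.
    static String extract(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up input with a four-digit number
        String fourDigits = "<a href=\"http://example.com/some_player_6690\">Player</a>";
        System.out.println(extract(THREE, fourDigits)); // prints null
        System.out.println(extract(MANY, fourDigits));  // prints 6690
    }
}
```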
Upvotes: 1