I love coding

Reputation: 1191

How to extract an integer from a string, between specific characters?

I want to extract an integer from a string in Java. The string is:

<a target="_blank" href="http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669">Pirlo A.</a>

I want to get the value "669", which sits between the last _ and the closing ". I know it is possible to use StringTokenizer, but the code I wrote is not very good. Is there a simpler way to do this?

Upvotes: 0

Views: 81

Answers (4)

xxxvodnikxxx

Reputation: 1277

Or you can use the String.split method (it splits the string around matches of the regex passed as a parameter and returns the pieces as an array):

  1. Split the input by ">
  2. Take the first result of step 1 and split it by _
  3. The number will be the last item in the resulting array (array[array.length-1])
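
The three steps above can be sketched roughly as follows. The class name SplitExample and the helper extractId are hypothetical names chosen for illustration, and the sketch assumes the input always has the same shape as the string in the question (one anchor tag whose href ends in _<digits>).

```java
public class SplitExample {

    // Hypothetical helper, not part of the original answer.
    // Assumes the href value ends with "_<digits>" and is closed by "> .
    static int extractId(String html) {
        // Step 1: split on the "> sequence and keep everything before it.
        String beforeClose = html.split("\">")[0];
        // Step 2: split that first piece on underscores.
        String[] parts = beforeClose.split("_");
        // Step 3: the number is the last item of the array.
        return Integer.parseInt(parts[parts.length - 1]);
    }

    public static void main(String[] args) {
        String input = "<a target=\"_blank\" href=\"http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669\">Pirlo A.</a>";
        System.out.println(extractId(input)); // prints 669
    }
}
```

Integer.parseInt throws NumberFormatException if the last piece is not a number, so this approach is only safe when the input format is known in advance.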

:)

Upvotes: 1

A4L

Reputation: 17595

Here is a solution using Jsoup:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public void extractNumber()
{
    String s = "<a target=\"_blank\" href=\"http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669\">Pirlo A.</a>";
    // Parse the HTML fragment into a document
    Document document = Jsoup.parse(s);
    System.out.println(document);
    // Collect all <a> elements and read the href of the first one
    Elements elementsByTag = document.getElementsByTag("a");
    String attr = elementsByTag.attr("href");
    System.out.println(attr);
    // The number is everything after the last underscore
    String sNumber = attr.substring(attr.lastIndexOf('_')+1);
    System.out.println(Integer.parseInt(sNumber));
}

Note that elementsByTag is a collection, so you may want to iterate over it and do the same for the href attribute of every <a> element.

Upvotes: 1

m0skit0

Reputation: 25873

You can solve it with regular expressions, using the Pattern and Matcher classes. Here's an example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    private static final Pattern PATTERN = Pattern.compile(".*_(\\d{3})\".*");

    public static void main(String[] args) {
        final String input = "<a target=\"_blank\" href=\"http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669\">Pirlo A.</a>";
        final Matcher m = PATTERN.matcher(input);
        // matches() requires the whole input to match the pattern
        if (m.matches()) {
            System.out.println(m.group(1));
        } else {
            System.out.println("No match");
        }
    }
}

The regular expression is .*_(\\d{3})\".*:

.* -> Any number (including 0) of any characters

_ -> Character _

(\\d{3}) -> Exactly 3 digits. The parentheses tell the regex engine to capture this match as a group, which can be referred to later as group 1 (m.group(1))

\" -> Double quote character

.* -> Any number (including 0) of any characters
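
If the number is not always three digits long, a variation on the same idea is to match only the relevant fragment with find() and convert the captured group to an int. This is a minimal sketch under that assumption. The class name TrailingIdExample and the helper extractId are hypothetical, and the pattern _(\\d+)\" (one or more digits between an underscore and a double quote) is a swapped-in alternative, not the pattern explained above.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrailingIdExample {

    // One or more digits that sit between an underscore and a double quote.
    private static final Pattern ID = Pattern.compile("_(\\d+)\"");

    // Hypothetical helper: returns the first such number, or empty if none is found.
    static OptionalInt extractId(String html) {
        Matcher m = ID.matcher(html);
        // find() looks for the pattern anywhere in the input,
        // so no leading or trailing .* is needed.
        if (m.find()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        String input = "<a target=\"_blank\" href=\"http://www.gazzetta.it/calcio/fantanews/statistiche/serie-a-2014-15/andrea_pirlo_669\">Pirlo A.</a>";
        System.out.println(extractId(input).getAsInt()); // prints 669
    }
}
```

Note that _blank" in the target attribute does not match, because at least one digit is required right after the underscore. Integer.parseInt would still throw NumberFormatException for a digit run too long to fit in an int.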

Upvotes: 4

Jan

Reputation: 2070

This code works:

    Pattern p = Pattern.compile(".*_(\\d+)\".*");
    Matcher m = p.matcher(<yourstringhere>);
    if (m.matches()) {
        System.out.println(m.group(1));
    }

To understand it, have a look at the documentation on regular expressions, e.g. http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

Upvotes: 1
