John Manak
John Manak

Reputation: 1215

How to extract numbers from a string and get an array of ints?

I have a String variable (basically an English sentence with an unspecified number of numbers) and I'd like to extract all the numbers into an array of integers. I was wondering whether there was a quick solution with regular expressions?


I used Sean's solution and changed it slightly:

LinkedList<String> numbers = new LinkedList<String>();

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(line); 
while (m.find()) {
   numbers.add(m.group());
}

Upvotes: 118

Views: 291675

Answers (13)

Mugoma J. Okomba
Mugoma J. Okomba

Reputation: 3295

The accepted answer detects digits but does not detect formated numbers, e.g. 2,000, nor decimals, e.g. 4.8. For such use -?\\d+(,\\d+)*?\\.?\\d+?:

Pattern p = Pattern.compile("-?\\d+(,\\d+)*?\\.?\\d+?");
List<String> numbers = new ArrayList<String>();
Matcher m = p.matcher("Government has distributed 4.8 million textbooks to 2,000 schools");
while (m.find()) {  
    numbers.add(m.group());
}   
System.out.println(numbers);

Output: [4.8, 2,000]

Upvotes: 8

user3509903
user3509903

Reputation: 65

public static String extractNumberFromString(String number) {
    String num = number.replaceAll("[^0-9]+", " ");
    return num.replaceAll(" ", "");
}

extracts only numbers from string

Upvotes: 0

The_Fresher
The_Fresher

Reputation: 77

I would suggest to check the ASCII values to extract numbers from a String Suppose you have an input String as myname12345 and if you want to just extract the numbers 12345 you can do so by first converting the String to Character Array then use the following pseudocode

    for(int i=0; i < CharacterArray.length; i++)
    {
        if( a[i] >=48 && a[i] <= 58)
            System.out.print(a[i]);
    }

once the numbers are extracted append them to an array

Hope this helps

Upvotes: 1

dxl
dxl

Reputation: 231

If you want to exclude numbers that are contained within words, such as bar1 or aa1bb, then add word boundaries \b to any of the regex based answers. For example:

Pattern p = Pattern.compile("\\b-?\\d+\\b");
Matcher m = p.matcher("9There 9are more9 th9an -2 and less than 12 numbers here9");
while (m.find()) {
  System.out.println(m.group());
}

displays:

2
12

Upvotes: 1

Maxim Shoustin
Maxim Shoustin

Reputation: 77904

What about to use replaceAll java.lang.String method:

    String str = "qwerty-1qwerty-2 455 f0gfg 4";      
    str = str.replaceAll("[^-?0-9]+", " "); 
    System.out.println(Arrays.asList(str.trim().split(" ")));

Output:

[-1, -2, 455, 0, 4]

Description

[^-?0-9]+
  • [ and ] delimites a set of characters to be single matched, i.e., only one time in any order
  • ^ Special identifier used in the beginning of the set, used to indicate to match all characters not present in the delimited set, instead of all characters present in the set.
  • + Between one and unlimited times, as many times as possible, giving back as needed
  • -? One of the characters “-” and “?”
  • 0-9 A character in the range between “0” and “9”

Upvotes: 55

AnDus
AnDus

Reputation: 89

Fraction and grouping characters for representing real numbers may differ between languages. The same real number could be written in very different ways depending on the language.

The number two million in German

2,000,000.00

and in English

2.000.000,00

A method to fully extract real numbers from a given string in a language agnostic way:

public List<BigDecimal> extractDecimals(final String s, final char fraction, final char grouping) {
    List<BigDecimal> decimals = new ArrayList<BigDecimal>();
    //Remove grouping character for easier regexp extraction
    StringBuilder noGrouping = new StringBuilder();
    int i = 0;
    while(i >= 0 && i < s.length()) {
        char c = s.charAt(i);
        if(c == grouping) {
            int prev = i-1, next = i+1;
            boolean isValidGroupingChar =
                    prev >= 0 && Character.isDigit(s.charAt(prev)) &&
                    next < s.length() && Character.isDigit(s.charAt(next));                 
            if(!isValidGroupingChar)
                noGrouping.append(c);
            i++;
        } else {
            noGrouping.append(c);
            i++;
        }
    }
    //the '.' character has to be escaped in regular expressions
    String fractionRegex = fraction == POINT ? "\\." : String.valueOf(fraction);
    Pattern p = Pattern.compile("-?(\\d+" + fractionRegex + "\\d+|\\d+)");
    Matcher m = p.matcher(noGrouping);
    while (m.find()) {
        String match = m.group().replace(COMMA, POINT);
        decimals.add(new BigDecimal(match));
    }
    return decimals;
}

Upvotes: 1

Shankara Narayana
Shankara Narayana

Reputation: 765

Extract all real numbers using this.

public static ArrayList<Double> extractNumbersInOrder(String str){

    str+='a';
    double[] returnArray = new double[]{};

    ArrayList<Double> list = new ArrayList<Double>();
    String singleNum="";
    Boolean numStarted;
    for(char c:str.toCharArray()){

        if(isNumber(c)){
            singleNum+=c;

        } else {
            if(!singleNum.equals("")){  //number ended
                list.add(Double.valueOf(singleNum));
                System.out.println(singleNum);
                singleNum="";
            }
        }
    }

    return list;
}


public static boolean isNumber(char c){
    if(Character.isDigit(c)||c=='-'||c=='+'||c=='.'){
        return true;
    } else {
        return false;
    }
}

Upvotes: 1

user2902302
user2902302

Reputation: 91

I found this expression simplest

String[] extractednums = msg.split("\\\\D++");

Upvotes: 0

Bernhard Barker
Bernhard Barker

Reputation: 55609

Using Java 8, you can do:

String str = "There 0 are 1 some -2-34 -numbers 567 here 890 .";
int[] ints = Arrays.stream(str.replaceAll("-", " -").split("[^-\\d]+"))
                 .filter(s -> !s.matches("-?"))
                 .mapToInt(Integer::parseInt).toArray();
System.out.println(Arrays.toString(ints)); // prints [0, 1, -2, -34, 567, 890]

If you don't have negative numbers, you can get rid of the replaceAll (and use !s.isEmpty() in filter), as that's only to properly split something like 2-34 (this can also be handled purely with regex in split, but it's fairly complicated).

Arrays.stream turns our String[] into a Stream<String>.

filter gets rid of the leading and trailing empty strings as well as any - that isn't part of a number.

mapToInt(Integer::parseInt).toArray() calls parseInt on each String to give us an int[].


Alternatively, Java 9 has a Matcher.results method, which should allow for something like:

Pattern p = Pattern.compile("-?\\d+");
Matcher m = p.matcher("There 0 are 1 some -2-34 -numbers 567 here 890 .");
int[] ints = m.results().map(MatchResults::group).mapToInt(Integer::parseInt).toArray();
System.out.println(Arrays.toString(ints)); // prints [0, 1, -2, -34, 567, 890]

As it stands, neither of these is a big improvement over just looping over the results with Pattern / Matcher as shown in the other answers, but it should be simpler if you want to follow this up with more complex operations which are significantly simplified with the use of streams.

Upvotes: 5

Sean Owen
Sean Owen

Reputation: 66886

Pattern p = Pattern.compile("-?\\d+");
Matcher m = p.matcher("There are more than -2 and less than 12 numbers here");
while (m.find()) {
  System.out.println(m.group());
}

... prints -2 and 12.


-? matches a leading negative sign -- optionally. \d matches a digit, and we need to write \ as \\ in a Java String though. So, \d+ matches 1 or more digits.

Upvotes: 188

Kannan
Kannan

Reputation: 109

  StringBuffer sBuffer = new StringBuffer();
  Pattern p = Pattern.compile("[0-9]+.[0-9]*|[0-9]*.[0-9]+|[0-9]+");
  Matcher m = p.matcher(str);
  while (m.find()) {
    sBuffer.append(m.group());
  }
  return sBuffer.toString();

This is for extracting numbers retaining the decimal

Upvotes: 10

sidereal
sidereal

Reputation: 1120

Pattern p = Pattern.compile("[0-9]+");
Matcher m = p.matcher(myString);
while (m.find()) {
    int n = Integer.parseInt(m.group());
    // append n to list
}
// convert list to array, etc

You can actually replace [0-9] with \d, but that involves double backslash escaping, which makes it harder to read.

Upvotes: 20

Andrey
Andrey

Reputation: 60065

for rational numbers use this one: (([0-9]+.[0-9]*)|([0-9]*.[0-9]+)|([0-9]+))

Upvotes: 5

Related Questions