dshil

Reputation: 381

Java String: remove all non-numeric characters and words like var123

I have a String:

String s = "12 text var2 14 8v 1";

I need to get only numbers from this string like:

12 14 1.

But I don't need words like:

var2 and 8v.

I tried this:

s = s.replaceAll("[^\\d.]", "");

Upvotes: 0

Views: 1888

Answers (5)

gkrls

Reputation: 2664

You can use the Scanner class to scan every word in the sentence, and pass each word to a method that checks whether it is a number.

import java.util.Scanner;

public class NumbersOnly { // class name is arbitrary

    // Returns true if the token parses as an int, false otherwise.
    static boolean isNumber(String a) {
        try {
            Integer.parseInt(a);
        } catch (NumberFormatException e) {
            return false; // thrown for tokens such as "text", "var2" or "8v"
        }
        return true; // the token was parsed successfully
    }

    public static void main(String[] args) {
        String s = "12 text var2 14 8v 1";
        Scanner in = new Scanner(s);
        String result = "";

        while (in.hasNext()) { // scan every whitespace-separated word
            String a = in.next();
            if (isNumber(a)) { // keep the word only if it is a number
                result += a + " ";
            }
        }
        in.close();

        // trim() drops the trailing space added inside the loop and,
        // unlike substring(0, length - 1), is safe when nothing matched
        result = result.trim();
        System.out.println(result);
    }
}

This prints: 12 14 1
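
A variation on the same token-by-token idea, as a rough sketch: split on whitespace and keep only the tokens made up entirely of digits. This avoids using an exception as the check and leaves no trailing space to clean up. The class name `DigitTokens` and the method `keepNumbers` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class DigitTokens {

    // Keeps only whitespace-separated tokens that consist entirely of digits.
    static String keepNumbers(String input) {
        return Arrays.stream(input.trim().split("\\s+"))
                .filter(token -> token.matches("\\d+")) // whole token must be digits
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(keepNumbers("12 text var2 14 8v 1")); // prints "12 14 1"
    }
}
```

Note that this is not strictly equivalent to the Integer.parseInt check: `\d+` rejects signed tokens such as "-5" and accepts digit runs too long to fit in an int.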

Upvotes: 0

Mena

Reputation: 48404

If you really want to use String.replaceAll for this, there's a workaround:

//                | one or more non-digits
//                |   | followed by one or more digits
//                |   |   | followed by one or more non-digits
//                |   |   |    | or the end of the input
//                |   |   |    |     | replace with single white space
s = s.replaceAll("\\D+\\d+(\\D+|$)", " ");

Output

12 14 1

However, this solution is ugly and might break with different inputs.
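
To make "might break" concrete, here is a small sketch that runs the same replaceAll call on an input consisting only of numbers. The class name `ReplaceAllPitfall` and the method `strip` are hypothetical wrappers added for illustration.

```java
public class ReplaceAllPitfall {

    // Same replaceAll call as above, wrapped in a method so it can be reused.
    static String strip(String s) {
        return s.replaceAll("\\D+\\d+(\\D+|$)", " ");
    }

    public static void main(String[] args) {
        System.out.println(strip("12 text var2 14 8v 1")); // prints "12 14 1"
        System.out.println(strip("12 14 1"));              // prints "12 1": the 14 is lost
    }
}
```

The pattern cannot tell a standalone number from digits inside a word: any digit run with non-digits on both sides is consumed, so " 14 " is replaced just like " text var2 ".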

I recommend matching the numbers you want instead, and collecting them by iterating over the matches.

Something along the lines of:

//                           | word boundary
//                           |  | one or more digits
//                           |  |    | word boundary
Pattern p = Pattern.compile("\\b\\d+\\b");
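
One way to finish that thought, as a minimal sketch: run a Matcher over the input and collect each match. The class name `StandaloneNumbers` and the method `find` are hypothetical names for illustration; the pattern is the one shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StandaloneNumbers {

    private static final Pattern NUMBER = Pattern.compile("\\b\\d+\\b");

    // Collects every digit run that has a word boundary on both sides.
    static List<String> find(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", find("12 text var2 14 8v 1"))); // prints "12 14 1"
    }
}
```

Keep in mind that \b also sees a boundary next to punctuation, so an input like "3.5" would yield "3" and "5" as separate matches.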

Upvotes: 2

Michał Schielmann

Reputation: 1382

Try this regex:

([0-9]*[^0-9\s]+[0-9]*\s*)

This matches any run that optionally starts with digits ([0-9]*), continues with one or more characters that are neither digits nor whitespace ([^0-9\s]+), and optionally ends with more digits and trailing whitespace ([0-9]*\s*). In other words, it matches everything except the standalone numbers. It works for all kinds of characters, including special characters.

Using it this way would result in what you need:

String myString = "12 text var2 14 8v 1";
myString = myString.replaceAll("([0-9]*[^0-9\\s]+[0-9]*\\s*)", "");
System.out.println(myString);

Output:

12 14 1

Upvotes: -1

Balicanta

Reputation: 109

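Another solution, using Guava and Apache Commons Lang:

import com.google.common.base.CharMatcher;
import com.google.common.base.Splitter;
import org.apache.commons.lang3.StringUtils;

String s = "12 text var2 14 8v 1";
// CharMatcher.breakingWhitespace() replaces the BREAKING_WHITESPACE constant in recent Guava versions
Iterable<String> split = Splitter.on(CharMatcher.breakingWhitespace()).split(s);

for (String string : split) {
    // StringUtils has no isNumber method; isNumeric is true only when every character is a digit
    if (StringUtils.isNumeric(string)) {
        System.out.println(string);
    }
}

// Prints 12, 14 and 1, each on its own line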

Upvotes: 0

T.J. Crowder

Reputation: 1074266

The key here is word boundaries (\b). This seems to work:

String s = "x4 12 text var2 14 8v 1 1a";
s = s.replaceAll("\\b[\\d.]*[^ \\d.]+[\\d.]*\\b", "").replaceAll("  +", " ").trim();
System.out.println(s); // "12 14 1"

What that does is look for word boundaries on either side of anything that contains at least one character that is not a digit, a decimal point, or a space, and remove the entire match. You may need to add more than just spaces to the negated character class in the middle, depending on your input. Then I collapse the leftover runs of spaces and trim the ends.
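
As an example of widening that negated class, here is a sketch that assumes tokens may be separated by tabs or newlines as well as spaces. It swaps the literal space for \s in both patterns. The class name `WordBoundaryFilter` and the method `keepNumbers` are hypothetical names for illustration.

```java
public class WordBoundaryFilter {

    // Variant of the pattern above with \s in place of the literal space,
    // so a tab or newline can no longer glue a number to a neighbouring word.
    static String keepNumbers(String s) {
        return s.replaceAll("\\b[\\d.]*[^\\s\\d.]+[\\d.]*\\b", "")
                .replaceAll("\\s+", " ") // collapse leftover whitespace runs
                .trim();
    }

    public static void main(String[] args) {
        System.out.println(keepNumbers("x4\t12 text\nvar2 14 8v 1 1a")); // prints "12 14 1"
    }
}
```

With the original space-only class, an input like "12\ttext" would be matched as a single word (the tab counts as a non-digit, non-space character), and the 12 would be removed along with it.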

Upvotes: 0
