user3880092

Can't get rid of extra spaces in Java

I have extra spaces, for example "- - - -", that I'm trying to remove. I tried the regex "\s+" as well as writing my own function.

System.out.println(test.removeExtraSpaces("-   -   -  "));
System.out.println(test.removeExtraSpaces("-   -   -  "));

and my results are

- - -
-   -   -  

In the first call I typed the spaces myself, three between each dash; the second string comes from an imported file. I think the problem is that the imported characters are not "real" spaces, or are spaces with a different Unicode code point, but I don't know how to remove them.
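One way to check that suspicion is to print the code point of every character in both strings. The following is only a minimal sketch: the class name CodePointDump, the helper describe, and the sample strings are hypothetical, and the imported text is assumed to contain U+00A0 (non-breaking space).

```java
final class CodePointDump {
    // Lists every code point in the string in U+XXXX form, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String typed = "- -";           // ordinary space, U+0020
        String imported = "-\u00A0-";   // assumption: the import contains U+00A0
        System.out.println(describe(typed));    // U+002D U+0020 U+002D
        System.out.println(describe(imported)); // U+002D U+00A0 U+002D
    }
}
```

If the imported string shows anything other than U+0020 between the dashes, the characters only look like spaces.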

I started off using regex, but that didn't work, so I tried the following, which produces the result shown in the image below:

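public String removeExtraSpaces(String s){
    // Strings are immutable, so the result of trim() must be assigned back.
    s = s.trim();
    if(s.isEmpty()){
        return s;
    }
    String newString = "";

    // Copy each character, keeping a space only when the next character is not a space.
    // Only ' ' (U+0020) is treated as a space here.
    for(int i = 0; i < s.length() - 1; i++){
        if(s.charAt(i) != ' '){
            newString = newString + s.charAt(i);
        }
        else{
            if(s.charAt(i + 1) != ' '){
                newString = newString + s.charAt(i);
            }
        }
    }
    // The loop stops one character early, so append the last character here.
    newString = newString + s.charAt(s.length()-1);

    return newString.trim();
}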

Here is the result: https://i.sstatic.net/dUOKP.png

EDIT: People have been suggesting regex, which I've already tried. Here is proof that it does not work: https://i.sstatic.net/sC1om.png

Upvotes: 1

Views: 161

Answers (2)

Gabriel Negut

Reputation: 13960

By default, \s+ in Java matches only the ASCII whitespace characters [ \t\n\x0B\f\r], not every Unicode whitespace character. If you want to cover all of them, adapt your method to check for any Unicode whitespace character instead of only ' '.
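As a rough sketch of that adaptation (not the asker's code; the class name SpaceCollapser and the helper isAnySpace are made up for illustration), the check below combines Character.isWhitespace, which covers tabs and line breaks but excludes non-breaking spaces, with Character.isSpaceChar, which covers Unicode space separators such as U+00A0:

```java
final class SpaceCollapser {
    // True for tabs, line breaks and ordinary spaces (isWhitespace) and for
    // Unicode space separators such as U+00A0 (isSpaceChar).
    static boolean isAnySpace(int cp) {
        return Character.isWhitespace(cp) || Character.isSpaceChar(cp);
    }

    // Collapses every run of space-like characters into one ' ' and drops
    // leading and trailing runs.
    static String removeExtraSpaces(String s) {
        StringBuilder out = new StringBuilder();
        boolean pendingSpace = false;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (isAnySpace(cp)) {
                pendingSpace = true;
            } else {
                // Emit a single space only between two non-space characters.
                if (pendingSpace && out.length() > 0) {
                    out.append(' ');
                }
                pendingSpace = false;
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Mixed ordinary and non-breaking spaces, as assumed for the imported file.
        System.out.println("\"" + removeExtraSpaces("- \u00A0 - \u00A0 - \u00A0") + "\"");
    }
}
```

Using a StringBuilder also avoids building a new String on every loop iteration.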

Upvotes: 0

Pshemo

Reputation: 124225

The character with code point 160 is the non-breaking space (U+00A0). Java's \s does not treat it as whitespace by default, so \\s cannot match it. If you want to replace any kind of space (including the non-breaking one) and any other whitespace (such as tabs \t or line breaks \n \r), try:

replaceAll("[\\p{Zs}\\s]+"," ")

From http://www.regular-expressions.info/unicode.html

\p{Zs} will match any kind of space character


Demo:

char[] arr = { 45, 32, 160, 32, 45, 32, 160, 32, 45, 32, 160 };
String str = new String(arr);
System.out.println("original: \"" + str + "\"");
str = str.replaceAll("[\\p{Zs}\\s]+", " ");
System.out.println("replaced: \"" + str + "\"");

Output:

original: "-   -   -  "
replaced: "- - - "
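The same pattern could be wrapped in the question's removeExtraSpaces method. This is only a sketch under assumptions: the class name RegexSpaces is hypothetical, and the pattern is compiled once purely as a convenience. Replacement happens before trim() because trim() only strips characters up to U+0020 and would leave a leading or trailing non-breaking space in place.

```java
import java.util.regex.Pattern;

final class RegexSpaces {
    // Any run of Unicode space separators (\p{Zs}) or ASCII whitespace (\s).
    private static final Pattern SPACES = Pattern.compile("[\\p{Zs}\\s]+");

    static String removeExtraSpaces(String s) {
        // Replace first so that non-breaking spaces at either end become
        // ordinary spaces, which trim() can then remove.
        return SPACES.matcher(s).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String imported = "- \u00A0 - \u00A0 - \u00A0"; // same characters as the demo array
        System.out.println("\"" + removeExtraSpaces(imported) + "\""); // "- - -"
    }
}
```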

Upvotes: 2
