Reputation:
I have strings with extra spaces, for example "- - - -", that I'm trying to remove. I tried using the regex "\s+" as well as writing my own function.
System.out.println(test.removeExtraSpaces("- - - "));
System.out.println(test.removeExtraSpaces("- - - "));
and my results are
- - -
- - -
For the first one I physically typed out the spaces, three of them between each dash; the second one comes from an imported file. I think the problem is that the imported characters are not "real" spaces, or are spaces with a different Unicode code point, but I don't know how to remove them.
I started off using regex, but that didn't work, so I tried this:
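public String removeExtraSpaces(String s) {
    // Strings are immutable, so the trimmed result has to be assigned back.
    s = s.trim();
    if (s.isEmpty()) {
        return s;
    }
    String newString = "";
    for (int i = 0; i < s.length() - 1; i++) {
        if (s.charAt(i) != ' ') {
            newString = newString + s.charAt(i);
        } else if (s.charAt(i + 1) != ' ') {
            // Keep only the last space of a run; this compares against U+0020 only.
            newString = newString + s.charAt(i);
        }
    }
    // The loop stops one character short, so append the final character here.
    newString = newString + s.charAt(s.length() - 1);
    return newString.trim();
}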
Here is the result https://i.sstatic.net/dUOKP.png
EDIT: People have been suggesting regex which I've already tried but here is the proof that regex does not work: https://i.sstatic.net/sC1om.png
Upvotes: 1
Views: 161
Reputation: 13960
By default, \s+ matches only some of the Unicode whitespace characters (space, tab, line feed, vertical tab, form feed and carriage return). If you want to cover all of them, adapt your method to check for any Unicode whitespace character instead of only plain spaces.
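As a rough sketch of what that adaptation could look like: the version below treats a character as a space if either Character.isWhitespace or Character.isSpaceChar accepts it, and collapses each run into one ordinary space. The class name WhitespaceCollapser and the helper isAnySpace are made up for illustration; they are not from the question.

```java
class WhitespaceCollapser {

    // Hypothetical helper: true for Java whitespace (tab, newline, ...) and for
    // Unicode space separators such as the non-breaking space U+00A0.
    static boolean isAnySpace(int codePoint) {
        return Character.isWhitespace(codePoint) || Character.isSpaceChar(codePoint);
    }

    // Collapses every run of space-like characters into a single ' ' and
    // drops leading and trailing runs entirely.
    static String removeExtraSpaces(String s) {
        StringBuilder out = new StringBuilder();
        boolean pendingSpace = false;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (isAnySpace(cp)) {
                // Only remember the space if something was already emitted.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same characters as an imported string containing U+00A0.
        System.out.println("[" + removeExtraSpaces("- \u00A0 - \u00A0 - \u00A0") + "]");
    }
}
```

Using a StringBuilder instead of repeated string concatenation is just a convenience here; the important part is the broader space check.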
Upvotes: 0
Reputation: 124225
The character with code point 160 is a non-breaking space, which \\s does not treat as whitespace, so \\s will not be able to match it. If you want to replace any kind of space (including the non-breaking one) as well as any other whitespace (like tabs \t or line breaks \n \r), try
replaceAll("[\\p{Zs}\\s]+"," ")
From http://www.regular-expressions.info/unicode.html:
\p{Zs} will match any kind of space character
Demo:
char[] arr = { 45, 32, 160, 32, 45, 32, 160, 32, 45, 32, 160 };
String str = new String(arr);
System.out.println("original: \"" + str + "\"");
str = str.replaceAll("[\\p{Zs}\\s]+", " ");
System.out.println("replaced: \"" + str + "\"");
Output:
original: "-     -     -   "
replaced: "- - - "
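If you are not sure which characters an imported string really contains, one way to check is to print every code point. This is only a small sketch; the class name CodePointDump and the method describe are invented for this example.

```java
class CodePointDump {

    // Hypothetical helper: one line per code point, formatted as "U+XXXX NAME".
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp ->
                sb.append(String.format("U+%04X %s", cp, Character.getName(cp)))
                  .append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) {
        // A dash, a normal space, a non-breaking space, a normal space, a dash.
        System.out.print(describe("- \u00A0 -"));
    }
}
```

A line reading U+00A0 NO-BREAK SPACE in the dump tells you the string holds the character that \\s misses by default.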
Upvotes: 2