A_Elric

Reputation: 3568

Regex is eating too much stuff

I recently opened a question and ended up solving it with a regex. The regex I used essentially ate ALL my non-English characters.

Let me retry this:

I want to eat all non-keyboard characters that may exist in a string.

The regex that I'm using is:

[^\\p{L}\\p{N}]

However this turns stuff like

10/10/2012 10:51:25 AM

into

10102012105125AM

Is there some way to easily exclude all alt-code characters from a string with replaceAll and leave keyboard characters like % / \ : and others intact?

Thanks!

Upvotes: 1

Views: 180

Answers (3)

Marko Topolnik

Reputation: 200158

You probably want to keep only the printable ASCII characters. The character range [ -~] (space through tilde) covers exactly those. If you also want to keep whitespace characters such as tabs and newlines, add \s to the class: [ -~\s].

System.out.println(input.replaceAll("[^ -~\\s]+", ""));
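For anyone who wants to try this out, here is a minimal self-contained sketch of the same replacement. The class name AsciiFilter and the method keepPrintableAscii are made up for illustration, and the sample strings are my own apart from the timestamp from the question.

```java
// Hypothetical helper class, not part of the original answer.
class AsciiFilter {
    // Removes every character that is neither printable ASCII (space through tilde)
    // nor whitespace as matched by \s.
    static String keepPrintableAscii(String input) {
        return input.replaceAll("[^ -~\\s]+", "");
    }

    public static void main(String[] args) {
        // Punctuation such as / and : is inside the [ -~] range, so it is kept.
        System.out.println(keepPrintableAscii("10/10/2012 10:51:25 AM\u263a"));
        // The accented letter is outside ASCII, so it is dropped.
        System.out.println(keepPrintableAscii("Caf\u00e9 100%"));
    }
}
```

Note that this also drops accented letters, so it only fits if plain ASCII output is really what you are after.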

Upvotes: 2

AlexR

Reputation: 115328

What about \p{Print}? In Java it matches all printable ASCII characters by default, which sounds like exactly what you need.
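A rough sketch of how that might look with replaceAll, assuming you also want to keep tabs and newlines (\p{Print} on its own does not cover them). The class name PrintFilter and the method keepPrintable are hypothetical names used only for this example.

```java
// Hypothetical helper class for illustration only.
class PrintFilter {
    // Removes every character that is neither in the POSIX class \p{Print}
    // (printable ASCII by default in java.util.regex) nor whitespace.
    static String keepPrintable(String input) {
        return input.replaceAll("[^\\p{Print}\\s]+", "");
    }

    public static void main(String[] args) {
        // The slashes, colons and space survive; the trailing non-ASCII symbol does not.
        System.out.println(keepPrintable("10/10/2012 10:51:25 AM\u263a"));
    }
}
```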

Upvotes: 0

jheddings

Reputation: 27563

To remove all non-ASCII characters:

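String mystring = "10/10/2012 10:51:25 AM"; // your input string goes here
// Strings are immutable, so replaceAll returns a new string; assign it back.
mystring = mystring.replaceAll("[^ -~\\s]+", "");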

Upvotes: 1
