priyanka

Reputation: 21

Java regular expression to replace single letters with white spaces on either side with a space

Can anyone help me with a regular expression to replace all single letters with spaces? Example:

 input: "this is a t f with u f array"
output: "this is       with     array".

My regular expression is replaceAll("(\\s+[a-z]\\s+)"," "), but it works as follows:

  input: "this is a t f with u f array"
 output: "this is   t f with   f array".

Upvotes: 2

Views: 2195

Answers (7)

Vitalii Fedorenko

Reputation: 114420

You can try word boundaries:

"this is a t f with u f array".replaceAll("\\b[a-z]\\b"," ")
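A minimal sketch of the word-boundary approach, wrapped in a class so it can be run as is. The class and method names (WordBoundaryDemo, blankSingleLetters) are made up for illustration. Because only the letter itself is matched, none of the surrounding spaces are consumed, so the original spacing survives.

```java
class WordBoundaryDemo {
    // Replace each lowercase letter that stands alone as a word with one space.
    // \b is zero-width, so the neighbouring spaces are never part of the match.
    static String blankSingleLetters(String input) {
        return input.replaceAll("\\b[a-z]\\b", " ");
    }

    public static void main(String[] args) {
        String result = blankSingleLetters("this is a t f with u f array");
        // Brackets make the preserved run of spaces visible.
        System.out.println("[" + result + "]");
    }
}
```

One thing to watch: \b also sees a boundary next to punctuation, so the t in "don't" counts as a single letter here and would be blanked too.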

Upvotes: 2

Helter Scelter

Reputation: 715

String a = "this is a t f with u f array";

a = a.replaceAll("(\\s\\p{Alpha}(?=\\s))+((?=\\s)\\s)", " ");

A zero-width positive lookahead, followed by a match of the trailing space in a capture group, produces what you're looking for:

this is with array

Upvotes: 0

M. Jessup

Reputation: 8222

The problem occurs because of the way replaceAll works. After it replaces a section, it resumes searching immediately after the text it just matched, so any space that match consumed is no longer available to the next one. When your pattern runs, you get the result

this is t with f array

What is happening internally is:

  1. Match the pattern against "this is a t f with u f array".
  2. Match found at " a ".
  3. Replace with " ".
  4. Begin matching after the last match ("t f with u f array").
  5. Note "t " does not match because there is no leading space.
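The consumption described in the steps above can be observed directly with Matcher.find(). This is only an illustrative trace; the class and method names (ConsumedSpaceTrace, trace) are hypothetical. It records each match of the question's pattern with its start and end offsets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConsumedSpaceTrace {
    // Collect every match of the question's pattern, with start and end offsets.
    static List<String> trace(String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile("\\s+[a-z]\\s+").matcher(input);
        while (m.find()) {
            // The next find() resumes at m.end(), after the consumed trailing space.
            hits.add("[" + m.group() + "] " + m.start() + "-" + m.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Prints: [ a ] 7-10, [ f ] 11-14, [ u ] 18-21 (one per line)
        trace("this is a t f with u f array").forEach(System.out::println);
    }
}
```

The letters at offsets 10 (t) and 21 (f) sit exactly where a previous match ended, so they have no leading space left to match against.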

What you need is a trick called "zero-width positive lookahead". If you use the pattern:

(\\s+[a-z](?=\\s))

The lookahead (?=\\s) says "try to match a whitespace character here, but don't count it as part of the match". So when the next match occurs it will be able to use that space as part of its match.

You will also need to replace with the empty string, since the trailing space is not removed, i.e.

"this is a t f with u f array".replaceAll("(\\s+[a-z](?=\\s))","")
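Here is a small runnable sketch of the lookahead version, assuming the replacement is the empty string as described. The names LookaheadDemo and dropSingleLetters are invented for the example.

```java
class LookaheadDemo {
    // Remove "whitespace + single letter" whenever whitespace follows.
    // The lookahead checks the following space without consuming it,
    // so that space can start the next match.
    static String dropSingleLetters(String input) {
        return input.replaceAll("\\s+[a-z](?=\\s)", "");
    }

    public static void main(String[] args) {
        System.out.println(dropSingleLetters("this is a t f with u f array")); // this is with array
        System.out.println(dropSingleLetters("a b c")); // a c
    }
}
```

As the second call shows, a letter at the very start or end of the string is left alone, because the pattern requires whitespace on both sides. That seems to match what the question asks for, but it is worth knowing.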

Upvotes: 6

Brian Roach

Reputation: 76898

replaceAll("\\b[a-z]\\b", " ");

will output

this is       with     array

The problem is in how replaceAll approaches things. \\s[a-z]\\s matches

" a "

then moves on to

"t f with u f array"

which causes it to miss the first t, because the space before it has already been consumed.

Upvotes: 0

limc

Reputation: 40168

You could use a word boundary:

    String s = "this is a t f with u f array";
    s = s.replaceAll("\\b\\w\\b\\s+", "");
    System.out.println(s); // this is with array
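One caveat with \\w: it matches digits and underscores as well as letters. The sketch below compares it with a letters-only character class; the class and method names are hypothetical, and the second pattern is a variation for comparison, not part of the answer above.

```java
class SingleCharTokens {
    // The pattern from the answer: any single word character (letter, digit, underscore).
    static String stripWordChars(String input) {
        return input.replaceAll("\\b\\w\\b\\s+", "");
    }

    // Variation: restrict the match to lowercase letters only.
    static String stripLetters(String input) {
        return input.replaceAll("\\b[a-z]\\b\\s+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripWordChars("take 5 x steps")); // take steps
        System.out.println(stripLetters("take 5 x steps"));   // take 5 steps
    }
}
```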

Upvotes: 0

JB Nizet

Reputation: 691655

This one works on your test input:

(\s+[a-z](\s[a-z])*\s+)
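For reference, a sketch of how that pattern might be used from Java. The backslashes are doubled for the string literal, and the single-space replacement is an assumption, since the answer does not say what to replace with. RunOfLettersDemo and collapseLetterRuns are made-up names.

```java
class RunOfLettersDemo {
    // Match a whole run of single letters in one go, including the
    // whitespace on both ends, and replace the run with one space.
    static String collapseLetterRuns(String input) {
        return input.replaceAll("(\\s+[a-z](\\s[a-z])*\\s+)", " ");
    }

    public static void main(String[] args) {
        System.out.println(collapseLetterRuns("this is a t f with u f array")); // this is with array
    }
}
```

Because the whole run " a t f " is consumed by a single match, the problem of a shared space between neighbouring matches does not arise.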

Upvotes: 0

Axel

Reputation: 14149

Hm... maybe because when the " a " is found and replaced in "... a t f ..", the matcher looks at the following character, which is 't' (the space is already consumed). But then again I'd expect the output to be "this is t with f array".

Try using replaceAll("((\\s+[a-z])*\\s+)"," ") instead. But it has the (unwanted?) side effect that any run of whitespace will be reduced to a single space.
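A quick sketch of that side effect, with the backslashes doubled as a Java string literal needs. The names WhitespaceCollapseDemo and collapse are hypothetical.

```java
class WhitespaceCollapseDemo {
    // Zero or more "whitespace + letter" groups followed by whitespace.
    // With zero groups the pattern is just \s+, so every run of
    // whitespace matches and is replaced by a single space.
    static String collapse(String input) {
        return input.replaceAll("((\\s+[a-z])*\\s+)", " ");
    }

    public static void main(String[] args) {
        System.out.println(collapse("this is a t f with u f array")); // this is with array
        System.out.println(collapse("keep   these    gaps"));         // keep these gaps
    }
}
```

The second call contains no single letters at all, yet its spacing is still normalized.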

Upvotes: 0
