mackatozis

Reputation: 252

Regex to replace multiple spaces with a single space, excluding leading spaces

I'm trying to write a regex that matches runs of two or more whitespace characters, excluding leading whitespace. Take the example below:

One OS to rule them     all,
    One  OS  to  find    them.
    One     OS to call them    all,
    And  in  salvation    bind         them.
    In  the  bright  land  of  Linux,
    Where the     hackers play.

I want it to become:

One OS to rule them all,
    One OS to find them.
    One OS to call them all,
    And in salvation bind them.
    In the bright land of Linux,
    Where the hackers play.

Using the regex ([ ]* ){2,} I can match two or more spaces. The problem is that it also matches the leading whitespace on lines 2 - 6.

Note: I want to use this regex inside IntelliJ IDEA.

Upvotes: 2

Views: 7046

Answers (3)

Federico Piazza

Reputation: 31035

You can use a regex like this:

\b\s+\b

with a single space as the replacement string.
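
As a rough illustration of that substitution outside the IDE, here is a minimal Python sketch. The sample text is a shortened copy of the question's example, and the variable names are only illustrative.

```python
import re

# Shortened copy of the sample text from the question.
text = (
    "One OS to rule them     all,\n"
    "    One  OS  to  find    them.\n"
    "    Where the     hackers play."
)

# \b\s+\b matches whitespace that has a word character on both sides,
# so indentation after a punctuation mark and a newline is not touched.
result = re.sub(r"\b\s+\b", " ", text)
print(result)
```

One caveat: \s also matches newlines, so a line that ends in a word character (no trailing punctuation) would be joined with the next line.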

Update for IntelliJ: it seems the \b assertions aren't working in IntelliJ, so you can try this workaround instead:

(\w+ )\s+

With replacement string: $1
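
A small Python sketch of the same capture-and-replace idea, in case it helps to test the pattern before pasting it into the IDE. Python writes the backreference as \1 where IntelliJ uses $1; the sample text and names are illustrative.

```python
import re

# Shortened copy of the sample text from the question.
text = (
    "One OS to rule them     all,\n"
    "    One  OS  to  find    them.\n"
    "    Where the     hackers play."
)

# (\w+ ) captures a word plus one space; \s+ consumes the extra whitespace.
# Replacing with group 1 keeps the word and exactly one space.
result = re.sub(r"(\w+ )\s+", r"\1", text)
print(result)
```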

Of course, this regex covers fewer scenarios (it only matches extra spaces that follow a word character), but you can try it.

Upvotes: 3

Jan

Reputation: 43169

If your engine supports the (*SKIP)(*FAIL) verbs (PCRE, for example), you could also use:

^[ ]+(*SKIP)(*FAIL)  # match spaces at the beginning of a line
                     # these shall fail
|                    # OR
[ ]{2,}              # at least two spaces

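See a demo on regex101.com (mind the modifiers: multiline so that ^ matches at each line start, and extended so that the comments and layout whitespace are ignored).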
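
Python's standard re module has no (*SKIP)(*FAIL), so the sketch below swaps in a different but related technique: match the leading spaces in a capture group and put them back unchanged, and collapse everything else. The function name collapse_inner_spaces is made up for this example.

```python
import re

# Alternative 1 captures leading spaces (to be kept as-is);
# alternative 2 matches any other run of two or more spaces.
PATTERN = re.compile(r"(^[ ]+)|[ ]{2,}", re.MULTILINE)

def collapse_inner_spaces(text: str) -> str:
    # If group 1 took part in the match, it is indentation: return it unchanged.
    # Otherwise the match is an inner run of spaces: replace it with one space.
    return PATTERN.sub(lambda m: m.group(1) or " ", text)

sample = "One OS to rule them     all,\n    One  OS  to  find    them."
print(collapse_inner_spaces(sample))
```

This needs a replacement callback, so it is a scripting alternative rather than something to paste into a find-and-replace dialog.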

Upvotes: 2

Aaron

Reputation: 24812

In your example, you could use the word-boundary metacharacter:

\b\s{2,}

That will match two or more whitespace characters that follow the end of a word (\b also matches at the start of a word, but whitespace can never follow that position).

However, it would fail in a more general case where you have multiple spaces following a special character, which isn't considered part of a word.
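
To make both cases concrete, here is a short Python sketch. The helper name collapse and the two input strings are just illustrations.

```python
import re

def collapse(text: str) -> str:
    # Replace runs of 2+ whitespace characters that directly follow a word character.
    return re.sub(r"\b\s{2,}", " ", text)

# Works: every run of extra spaces follows a word, and the indentation is kept.
after_word = collapse("    One  OS  to  find    them.")

# Fails: the extra spaces follow a comma, so there is no word boundary before them.
after_comma = collapse("rule them all,   and more")

print(repr(after_word))
print(repr(after_comma))
```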

If your language supports unbounded-width lookbehind, you can use the following:

(?<!^\s*)\s{2,}
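
Python's standard re module is one engine that rejects this pattern, because its lookbehind must be fixed-width. A possible workaround, sketched below, is to process the text line by line: set the indentation aside, then collapse the rest. This is a substitute technique rather than the lookbehind itself, and the function name collapse_after_indent is hypothetical.

```python
import re

def collapse_after_indent(text: str) -> str:
    lines = []
    for line in text.split("\n"):
        # Leading spaces/tabs are kept exactly as they are.
        indent = re.match(r"[ \t]*", line).group()
        # Everything after the indentation gets its runs of 2+ blanks collapsed.
        body = re.sub(r"[ \t]{2,}", " ", line[len(indent):])
        lines.append(indent + body)
    return "\n".join(lines)

sample = "    And  in  salvation    bind         them,   too.\nall,   and"
print(collapse_after_indent(sample))
```

Unlike the word-boundary version, this also handles extra spaces that follow punctuation.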

Upvotes: 2
