Alfodr

Reputation: 37

Find a match but not if followed by a specific character

I want to match all repeating spaces in each line of the text, excluding spaces around a specified character.

For example, suppose I want to find repeating spaces in this text, except those around #:

first    second third
first second    third
first    second #third

first second    #third
first second #    third
#    first second third

I expect to match the repeated spaces in the first 3 lines, but to have no matches in the last 3.

I have tried this: /(\S\s{2,})(?!\s*#)/g but it also matches repeating spaces after #. How can I write a regex for this case?

Upvotes: 0

Views: 62

Answers (2)

The fourth bird

Reputation: 163632

You could match what you want to get out of the way, and keep in a group what you are looking for.

(^[^\S\n]+|[^\S\n]*#.*)|[^\S\n]{2,}

Explanation

  • ( Capture group 1 (to keep unmodified)
    • ^[^\S\n]+ Match 1+ spaces without newlines at the start of the string (or at the start of each line when the multiline flag is set)
    • | Or
    • [^\S\n]*#.* Match optional spaces, # and the rest of the line
  • ) Close group 1
  • | Or
  • [^\S\n]{2,} Match 2 or more spaces without newlines

See a regex demo.

There is no language tagged, but if for example you want to replace the repeating spaces in the first 3 lines with a single space, this is a JavaScript example where you check for group 1.

If it is present, then return it to keep that part unmodified; otherwise replace the match with your replacement string.

Note that \s can also match newlines, so I used [^\S\n]+ to match spaces without newlines, and added dashes between the 2 parts of the sample text for clarity.

// The m flag lets ^ match at the start of every line, so indentation is kept on each line
const regex = /(^[^\S\n]+|[^\S\n]*#.*)|[^\S\n]{2,}/gm;
const str = `first    second third
first second    third
first    second #third
----------------------------------------
    keep the indenting second #third     #fourth  b.
first second    #third
first second #    third these spaces  should not  match
#    first second third`;
console.log(str.replace(regex, (_, g1) => g1 || " "));
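
If you only need the positions of the matches rather than a replacement, the same idea works with matchAll: skip every match where group 1 took part. This is a minimal sketch; the helper name findRepeatedSpaces is hypothetical, and the m flag is an assumption made here so that ^ applies to each line.

```javascript
// Hypothetical helper (name is illustrative): collect the runs of repeated
// spaces that are matched outside group 1, i.e. not leading indentation and
// not from optional spaces before '#' up to the end of that line.
function findRepeatedSpaces(text) {
  // Assumption: the m flag is used so that ^ matches at every line start.
  const regex = /(^[^\S\n]+|[^\S\n]*#.*)|[^\S\n]{2,}/gm;
  const hits = [];
  for (const m of text.matchAll(regex)) {
    // Group 1 is undefined when the second alternative produced the match.
    if (m[1] === undefined) {
      hits.push({ index: m.index, length: m[0].length });
    }
  }
  return hits;
}

const sample = [
  "first    second third",
  "first second    #third",
  "first second #    third",
  "#    first second third",
].join("\n");

console.log(findRepeatedSpaces(sample));
```

Only the run in the first sample line is reported here, because the other three lines are consumed by group 1.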

Upvotes: 2

lemon

Reputation: 15502

One possible solution with lookarounds:

(?<![#\s])\s\s+(?![\s#])

The pattern matches any run of 2+ whitespace characters (\s\s+) that is:

  • not preceded by either a space or # ((?<![#\s]))
  • not followed by either a space or # ((?![\s#]))

Check the demo here.
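
As a quick check, the lookaround pattern can be run against the six sample lines from the question. This is a sketch in JavaScript, assuming an engine with lookbehind support; the variable names are illustrative.

```javascript
// Lookaround pattern from this answer. No g flag, so test() keeps no state
// between calls.
const pattern = /(?<![#\s])\s\s+(?![\s#])/;

const lines = [
  "first    second third",
  "first second    third",
  "first    second #third",
  "first second    #third",
  "first second #    third",
  "#    first second third",
];

// true where the line contains a run of 2+ spaces that is not adjacent to '#'
const results = lines.map((line) => pattern.test(line));
console.log(results); // [ true, true, true, false, false, false ]
```

Note that this pattern only excludes runs directly adjacent to #. A run of spaces elsewhere on a line that contains #, for example in "a # b  c", still matches.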

Upvotes: 2
