Lee Quarella

Reputation: 4742

Regex to insert space in vim

I am a regex supernoob (just reading my first articles about them), and at the same time working towards stronger use of vim. I would like to use a regex to search for all instances of a colon : that are not followed by a space and insert one space between those colons and any character after them.

If I start with:

foo:bar

I would like to end with

foo: bar

I got as far as %s/:[a-z] but now I don't know what to do for the next part of the %s statement.

Also, how do I change the :[a-z] statement to make sure it catches anything that is not a space?

Upvotes: 15

Views: 9778

Answers (4)

jamessan

Reputation: 42767

:%s/:\(\S\)/: \1/g

\S matches any character that is not whitespace, but you need to remember what that non-whitespace character is. This is what the \(\) does. You can then refer to it using \1 in the replacement.

So you match a :, some non-whitespace character and then replace it with a :, a space, and the captured character.
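One way to try this without touching a buffer is Vim's substitute() function, which takes the same pattern syntax as :s. The sample strings below are made up for illustration:

```vim
" Sample strings are hypothetical; substitute() uses the same pattern as :s.
echo substitute('foo:bar', ':\(\S\)', ': \1', 'g')
" foo: bar
echo substitute('foo: bar', ':\(\S\)', ': \1', 'g')
" foo: bar   (unchanged: the colon is already followed by a space)
echo substitute('a::b', ':\(\S\)', ': \1', 'g')
" a: :b      (a second colon counts as non-whitespace too)
```

The last line shows why doubled colons may need special handling.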


Changing this to only modify the text when there's only one : is fairly straightforward. As others have suggested, using some of the zero-width assertions will be useful.

:%s/:\@<!:[^:[:space:]]\@=/: /g

  • :\@<! is a negative lookbehind: it asserts that the colon is not preceded by another :, which also holds at the start of the line. This is an important characteristic of the negative lookahead/lookbehind assertions. It's not requiring that there actually be a character, just that there isn't a :.

  • : matches the required colon.

  • [^:[:space:]] introduces a couple more regex concepts.

    • The outer [] is a collection. A collection is used to match any of the characters listed inside. However, a leading ^ negates that match. So, [abc123] will match a, b, c, 1, 2, or 3, but [^abc123] matches anything but those characters.

    • [:space:] is a character class. Character classes can only be used inside a collection. [:space:] means, unsurprisingly, any whitespace. In most implementations, it relates directly to the result of the C library's isspace function.

    Tying that all together, the collection means "match any character that is not a : or whitespace".

  • \@= is the positive lookahead assertion. It applies to the previous atom (in this case the collection) and means that the collection is required for the pattern to be a successful match, but will not be part of the text that is replaced.

So, whenever the pattern matches, we just replace the : with itself and a space.
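As a rough check of the behaviour described above, the same pattern can be fed to substitute() on a few made-up strings (the inputs are assumptions chosen for illustration):

```vim
" Hypothetical inputs; the pattern is lookbehind + colon + lookahead.
echo substitute('foo:bar', ':\@<!:[^:[:space:]]\@=', ': ', 'g')
" foo: bar
echo substitute('a::b', ':\@<!:[^:[:space:]]\@=', ': ', 'g')
" a::b       (unchanged: neither colon stands alone)
echo substitute(':x', ':\@<!:[^:[:space:]]\@=', ': ', 'g')
" : x        (the lookbehind also succeeds at the start of the string)
```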

Upvotes: 25

Sam Brinck

Reputation: 901

You probably want to use :[^ ] to match a colon followed by anything except a space. As mentioned by Matt, this will cause your replacement to overwrite the extra character.
There are several ways to avoid this; here are 2 that I find useful.
1) Surround the last part of the search term with parentheses \(\). This allows you to reference that part of the search in your replacement with \1.
Your final replace string should look like this:

%s/:\([^ ]\)/: \1/g

2) End the match early with \ze. The entire search term must still be met for a match, but only the part before \ze will be highlighted or replaced.
Your final replace string should look like this:

%s/:\ze[^ ]/: /g
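A small sketch of the \ze variant using substitute() on made-up strings. Note that [^ ] excludes only the literal space character, so a tab after the colon still counts as "not a space":

```vim
" Hypothetical inputs; only the colon before \ze is replaced.
echo substitute('foo:bar', ':\ze[^ ]', ': ', 'g')
" foo: bar
" A tab is not a space, so a space is inserted in front of it as well.
echo substitute("foo:\tbar", ':\ze[^ ]', ': ', 'g') ==# "foo: \tbar"
" 1
```

If tabs should be left alone, \S in place of [^ ] is one option.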

Upvotes: 5

sidyll

Reputation: 59327

An interesting feature of Vim regex is the presence of \zs and \ze. Other engines might have them too, but they're not very common.

The purpose of \zs is to mark the start of the match, and \ze the end of it. For example:

ab\zsc

matches c, only if before you have ab. Similarly:

a\zebc

matches a only if you have bc after it. You can mix both:

a\zsb\zec

matches b only if in between a and c. You can also create zero-width matches, which are ideal for what you're trying to do:

:%s/:\zs\ze\S/ /g

Your search has no size, only a position. And then you substitute that position with " ". By the way, \S means any character except whitespace.

:\zs\ze\S matches the position between a colon and something not a space.
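The examples above can be tried with substitute(), which accepts the same patterns. The replacement text X and the input strings are placeholders chosen for illustration:

```vim
" Each call replaces only the part between \zs and \ze.
echo substitute('abc', 'ab\zsc', 'X', '')
" abX
echo substitute('abc', 'a\zebc', 'X', '')
" Xbc
echo substitute('abc', 'a\zsb\zec', 'X', '')
" aXc
" Zero-width match: nothing is replaced, a space is inserted at the position.
echo substitute('foo:bar', ':\zs\ze\S', ' ', '')
" foo: bar
```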

Upvotes: 7

Karl Bielefeldt

Reputation: 49148

You want to use a zero-width negative lookahead assertion, which is a fancy way of saying look for a character that's not a space, but don't include it in the match:

:%s/: \@!/: /g

The \@! is the negative lookahead.
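A quick sketch with substitute() on made-up strings. Because a negative lookahead only requires that no space follows, it also succeeds when the colon is the last character, so a trailing space is added there:

```vim
" Hypothetical inputs; brackets make the trailing space visible.
echo substitute('foo:bar', ': \@!', ': ', 'g')
" foo: bar
echo '[' . substitute('foo:', ': \@!', ': ', 'g') . ']'
" [foo: ]
```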

Upvotes: 7
