chappar

Reputation: 7505

Vim regular expression to remove all but last two digits of number

I have the following text in a file:

23456789

When I try to replace the above text using the command

1,$s/\(\d\)\(\d\d\d\)\(\d\d\)*\>/\3\g

I get 89. Shouldn't it be 6789? Can anyone tell me why it is 89?

Upvotes: 8

Views: 10410

Answers (5)

Gaston

Reputation: 142

I tried this in nvi and it does not work. In Vim it works, but you must replace the final backslash before the g with a forward slash, like this:

1,$s/\(\d\)\(\d\d\d\)\(\d\d\)*\>/\3/g

and the text is replaced with 89. The reason is that the * lets the last \(\d\d\) group repeat zero or more times, and \> marks the end of a word. \3 refers to the last group, but because of the * it keeps only the final repetition, so the two digits it holds are 89. Removing the *\> gives 6789, because 234567 matches and is replaced with 67 while the trailing 89 is left alone. Like this:

1,$s/\(\d\)\(\d\d\d\)\(\d\d\)/\3/g

Watch out for the \>, which plays a tricky part, because with :1,$s/\(\d\)\(\d\d\d\)\(\d\d\)\>/\3 you get 2389. LOL! Because of the end-of-word boundary, the six \d atoms can only match 456789, and that is replaced with the last two digits, 89. So you get 23 + 89. Mind-blowing!
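The three variants above can be checked without touching a buffer. This is a minimal sketch, assuming Vim's built-in substitute() function, which takes the same pattern syntax as :s; the variable name sample is made up for illustration.

```vim
let sample = '23456789'

" Repeated group plus \>: the whole number matches, \3 keeps only the last repetition.
echo substitute(sample, '\(\d\)\(\d\d\d\)\(\d\d\)*\>', '\3', 'g')
" echoes 89

" No * and no \>: 234567 matches and becomes 67; the trailing 89 is untouched.
echo substitute(sample, '\(\d\)\(\d\d\d\)\(\d\d\)', '\3', 'g')
" echoes 6789

" \> without *: only the last six digits (456789) can match, leaving 23 in front.
echo substitute(sample, '\(\d\)\(\d\d\d\)\(\d\d\)\>', '\3', '')
" echoes 2389
```

The comments sit on their own lines because :echo does not accept a trailing " comment.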

Upvotes: 0

tddmonkey

Reputation: 21184

Group 3 is defined as being 2 digits long. If you want to match the last 4 digits you want \(\d\d\d\d\) with no * at the end. If you just want to match all digits but the first 4, put your * inside the group match rather than outside.
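Both suggestions can be sketched with substitute(). The leading \d* in the first pattern is an assumption added here to swallow whatever digits come before the last four; it is not part of the answer itself.

```vim
let sample = '23456789'

" Option 1: capture exactly the last four digits, with no * on the group.
echo substitute(sample, '\d*\(\d\d\d\d\)\>', '\1', '')
" echoes 6789

" Option 2: keep the first four digits in groups 1 and 2, and move * inside group 3.
echo substitute(sample, '\(\d\)\(\d\d\d\)\(\d*\)\>', '\3', '')
" echoes 6789
```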

Upvotes: 1

orip

Reputation: 75427

You probably want this (it needs an extra wrapping group):

%s/\(\d\)\(\d\d\d\)\(\(\d\d\)*\)\>/\3/g

Although I'm not sure why you're capturing the first 2 groups.

Upvotes: 0

Hasturkun

Reputation: 36402

You want to use a non-capturing group here, like so:

1,$s/\(\d\)\(\d\d\d\)\(\%(\d\d\)*\)\>/\3/g

which gives 6789 as the result here. If the input were changed to

2345678

the line would become 278.
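To see what each group actually holds, matchlist() can be used. It returns the whole match followed by the submatches; this sketch slices off the first four items and assumes nothing beyond Vim's built-in functions.

```vim
" Capturing inner group with * outside: \3 holds only the last repetition.
echo matchlist('23456789', '\(\d\)\(\d\d\d\)\(\d\d\)*\>')[0:3]
" echoes ['23456789', '2', '345', '89']

" Outer group wrapping a non-capturing \%(...\): \3 holds every repetition.
echo matchlist('23456789', '\(\d\)\(\d\d\d\)\(\%(\d\d\)*\)\>')[0:3]
" echoes ['23456789', '2', '345', '6789']

" Seven digits: no match can start at the 2, so the match begins at the 3.
echo matchlist('2345678', '\(\d\)\(\d\d\d\)\(\%(\d\d\)*\)\>')[0:3]
" echoes ['345678', '3', '456', '78']
```

In the last case the leading 2 is outside the match, which is why the line becomes 278.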

Upvotes: 4

Dave Sherohman

Reputation: 46187

As written, your regex captures one digit, then three digits, then any number of groups of two digits each. The third match will, therefore, always be two digits if it exists. In your particular test case, the repeated group matches '67' and then '89', and \3 keeps only the last repetition, '89'.

Changing the regex to

1,$s/\(\d\)\(\d\d\d\)\(\d\d\+\)\>/\3/g

will give you '6789' as the result, since it will capture two or more digits (up to as many as are there) in the third group.
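A quick check of this pattern with substitute(), including the seven-digit input from the non-capturing-group answer for comparison. This is a sketch, not part of the original answer.

```vim
" Eight digits: group 3 takes everything after the first four.
echo substitute('23456789', '\(\d\)\(\d\d\d\)\(\d\d\+\)\>', '\3', 'g')
" echoes 6789

" Seven digits: \d\d\+ is not limited to pairs, so the match still starts at the 2.
echo substitute('2345678', '\(\d\)\(\d\d\d\)\(\d\d\+\)\>', '\3', 'g')
" echoes 678
```

With an odd number of digits the two approaches differ: this one yields 678, while the pairs-only pattern yields 278.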

Upvotes: 4
