Replacing dots between characters

So I want to remove the dots in a string when there is no space before or after the dot. I thought this could easily be done with a regular expression, but I haven't been able to get it working.

I have patterns like h.e.ll.o w.o.r.l.d and I want them to be: hello world

I have tried the following patterns:

\w+(\.)+\w+
\w+(\.+\w+)
\w+\.+\w+

I always get something like: he.ll.o wo.rl.d

I am using Python's re module to match and replace with the following code:

>>> import re
>>> re.sub(r'\w+\.+\w+', lambda x: x.group(0).replace('.', ''), 'h.e.ll.o w.o.r.l.d')
'he.llo wo.rl.d'

Upvotes: 3

Views: 573

Answers (2)

Wiktor Stribiżew

Reputation: 627128

In all your patterns you consume a char after the dot, so there is no chance to match it in the next iteration with the first \w+ (as it must consume at least 1 word char).
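To make the consumption problem concrete, here is a small sketch that lists what the third pattern from the question actually matches on the sample string. The variable names are only illustrative.

```python
import re

s = 'h.e.ll.o w.o.r.l.d'

# Each match swallows the word chars on BOTH sides of its dot, so the
# next search resumes at a dot with no word char left in front of it.
matches = [m.group(0) for m in re.finditer(r'\w+\.+\w+', s)]
print(matches)  # ['h.e', 'll.o', 'w.o', 'r.l']
```

The dots between e and ll, between o and r, and between l and d fall between two matches, so the replacement callback never sees them.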

To fix your approach, you may match 1+ word chars followed with 0 or more repetitions of a group that matches 1+ . chars followed with 1+ word chars, so that a whole dotted run is consumed as a single match:

re.sub(r'\w+(?:\.+\w+)*', lambda x: x.group(0).replace('.', ''), s)

Here is the Python demo.
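As a quick check, the pattern can be wrapped in a function and run on the sample string. This is only a sketch; the name strip_inner_dots is a hypothetical helper, not part of the re module.

```python
import re

def strip_inner_dots(text):
    # Hypothetical helper: match a whole run such as 'h.e.ll.o' at once,
    # then drop the dots inside that run.
    return re.sub(r'\w+(?:\.+\w+)*',
                  lambda m: m.group(0).replace('.', ''),
                  text)

print(strip_inner_dots('h.e.ll.o w.o.r.l.d'))  # hello world
print(strip_inner_dots('the end. next'))       # the end. next
```

A dot followed by a space is never part of a match, so ordinary sentence punctuation is left alone.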

Another approach to remove . between word chars is

re.sub(r'\b\.\b', '', s)

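See this regex demo. Here, . is only matched when it sits directly between two word chars: since . itself is a non-word char, each \b can only succeed if the char on its other side is a word char.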
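A short sketch of how the word boundary version behaves on a few inputs (the extra sample strings are made up for illustration):

```python
import re

word_dot = re.compile(r'\b\.\b')

print(word_dot.sub('', 'h.e.ll.o w.o.r.l.d'))  # hello world
print(word_dot.sub('', 'end. next'))           # end. next
# Two dots in a row: neither dot has word chars on both sides.
print(word_dot.sub('', 'a..b'))                # a..b
```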

Alternatively, you may use this approach to match any . that is neither immediately preceded nor immediately followed by whitespace:

re.sub(r'(?<!\s)\.(?!\s)', '', 'h.e.ll.o w.o.r.l.d')

See the Python demo and the regex demo.

Details

  • (?<!\s) - a negative lookbehind that fails the match if there is a whitespace immediately to the left of the current location
  • \. - a dot
  • (?!\s) - a negative lookahead that fails the match if there is a whitespace immediately to the right of the current location.
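The lookaround version is looser than a word boundary check, which the following sketch tries to show. The sample strings other than the one from the question are assumptions chosen for illustration.

```python
import re

no_space_dot = re.compile(r'(?<!\s)\.(?!\s)')

print(no_space_dot.sub('', 'h.e.ll.o w.o.r.l.d'))  # hello world
print(no_space_dot.sub('', 'end. next'))           # end. next
# Consecutive dots are both removed, since neither touches whitespace.
print(no_space_dot.sub('', 'a..b'))                # ab
# A dot at the very end of the string has no whitespace after it either.
print(no_space_dot.sub('', 'done.'))               # done
```

If a trailing dot at the end of the string should be kept, this pattern needs an extra condition.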

Upvotes: 12

eddey

Reputation: 31

This would be my approach.

re.sub(r'\.(?=\w)', '', 'h.e.ll.o. w.o.r.l.d')

  • \. - a dot
  • (?=\w) - a positive lookahead that only lets the dot match if a word char follows immediately to its right.
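A small sketch of this pattern in action. Because there is no check on what comes before the dot, a dot preceded by a space is removed too; the second sample string is made up to show that.

```python
import re

before_word_dot = re.compile(r'\.(?=\w)')

print(before_word_dot.sub('', 'h.e.ll.o. w.o.r.l.d'))  # hello. world
# No lookbehind, so a dot after a space is also dropped.
print(before_word_dot.sub('', 'see .gitignore'))       # see gitignore
```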

Upvotes: 0
