Reputation: 321
What is the best way to remove words in a string that start with numbers and contain periods in Python?
this_string = 'lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. here and even more text here v7.8.989'
If I use Regex:
re.sub('[0-9]*\.\w*', '', this_string)
The result will be:
'lorum3 ipsum  bar foo  v more text 46  here and even more text here v'
I expect the word v7.8.989 not to be removed, since it starts with a letter.
It would also be great if removing the words did not leave extra spaces behind. My regex above still leaves them.
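To see why v7.8.989 gets cut down, it can help to list what the pattern actually matches. The following is a minimal sketch using re.findall on a shortened sample string (the sample is an assumption, chosen only for illustration). Because [0-9]* may match zero digits and nothing ties the match to the start of a word, the pattern is free to start in the middle of a word:

```python
import re

# Shortened sample, chosen for illustration only.
sample = 'v7.8.989 15.2.3.9.7 1.'

# [0-9]* can match zero digits and nothing anchors the match to a word start,
# so pieces inside 'v7.8.989' are matched as well.
matches = re.findall(r'[0-9]*\.\w*', sample)
print(matches)
# ['7.8', '.989', '15.2', '.3', '.9', '.7', '1.']
```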
Upvotes: 2
Views: 553
Reputation: 147256
You can use this regex to match the strings you want to remove:
(?:^|\s)[0-9]+\.[0-9.]*(?=\s|$)
It matches:
(?:^|\s) : beginning of string or whitespace
[0-9]+ : at least one digit
\. : a period
[0-9.]* : some number of digits and periods
(?=\s|$) : a lookahead to assert end of string or whitespace
You can then replace any matches with the empty string. In Python:
import re

this_string = 'lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. here and even more text here v7.8.989 and also 1.2.3c as well'
result = re.sub(r'(?:^|\s)[0-9]+\.[0-9.]*(?=\s|$)', '', this_string)
print(result)
Output:
lorum3 ipsum bar foo v more text 46 here and even more text here v7.8.989 and also 1.2.3c as well
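One edge case may be worth checking: when a removed token sits at the very start of the string, the pattern consumes no leading whitespace there, so the space after the token is left behind. Below is a minimal sketch that wraps the same pattern in a helper and strips that leftover. The helper name remove_numeric_tokens and the sample input are assumptions for illustration, not part of the original answer:

```python
import re

# Same pattern as above, compiled once for reuse.
PATTERN = re.compile(r'(?:^|\s)[0-9]+\.[0-9.]*(?=\s|$)')

def remove_numeric_tokens(text):
    """Hypothetical helper: drop digit-and-period tokens, then trim the ends."""
    # A token at position 0 is matched via '^', so the space following it
    # remains; lstrip() removes that leftover.
    return PATTERN.sub('', text).lstrip()

sample = '1. intro 2.5 v1.2 end 3.'
cleaned = remove_numeric_tokens(sample)
print(repr(cleaned))
# 'intro v1.2 end'
```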
Upvotes: 4
Reputation: 163632
If you can make use of a lookbehind, you can match the numbers and replace with an empty string:
(?<!\S)\d+\.[\d.]*(?!\S)
Explanation
(?<!\S) Assert a whitespace boundary to the left
\d+\.[\d.]* Match 1+ digits, then a dot followed by optional digits or dots
(?!\S) Assert a whitespace boundary to the right
If you want to match an optional leading whitespace char:
\s?(?<!\S)\d+\.[\d.]*(?!\S)
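As a rough illustration of the difference between the two patterns, the sketch below applies both to the string from the question with re.sub. The variable names plain and with_space are assumptions for this example:

```python
import re

this_string = ('lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. '
               'here and even more text here v7.8.989')

# Without the optional leading whitespace, the spaces on both sides of each
# removed token survive, so doubled spaces remain.
plain = re.sub(r'(?<!\S)\d+\.[\d.]*(?!\S)', '', this_string)

# With the optional leading whitespace, one of the two spaces is consumed.
with_space = re.sub(r'\s?(?<!\S)\d+\.[\d.]*(?!\S)', '', this_string)

print(repr(plain))
print(repr(with_space))
# 'lorum3 ipsum  bar foo  v more text 46  here and even more text here v7.8.989'
# 'lorum3 ipsum bar foo v more text 46 here and even more text here v7.8.989'
```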
Upvotes: 2
Reputation: 26
You can try this regex:
(^|\s)\d[^\s]*\.+[^\s]*
This also matches strings like '7.a.0.1', which contain letters in addition to digits and periods.
Here is a demo.
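For a quick local check, here is a minimal sketch applying this pattern with re.sub. The sample string is an assumption made up for illustration. Note that tokens such as '1.2.3c' are removed too, since the pattern accepts any non-whitespace characters after the leading digit:

```python
import re

# Made-up sample for illustration only.
sample = 'keep v7.8.989 drop 7.a.0.1 and 1.2.3c but keep 46'

# Tokens that start with a digit and contain at least one period are removed,
# together with the whitespace character in front of them.
result = re.sub(r'(^|\s)\d[^\s]*\.+[^\s]*', '', sample)
print(result)
# keep v7.8.989 drop and but keep 46
```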
Upvotes: 1
Reputation: 1098
If you don't want to use regex, you can also do it using simple string operations:
res = ' '.join(e for e in this_string.split() if not (e.startswith(tuple('0123456789')) and '.' in e))
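Note that the split-based approach does not behave identically to a strict digits-and-periods regex. The sketch below compares the two on a made-up sample (the sample string and variable names are assumptions). The token test here removes anything that starts with a digit and contains a period, so '1.2.3c' goes as well:

```python
import re

# Made-up sample for illustration only.
sample = 'foo 1. v7.8.989 bar 1.2.3c 46 baz'
digits = tuple('0123456789')

# Split-based filter: drops every token that starts with a digit and has a period.
by_split = ' '.join(
    e for e in sample.split()
    if not (e.startswith(digits) and '.' in e)
)

# Regex restricted to digits and periods: '1.2.3c' is kept.
by_regex = re.sub(r'(?:^|\s)[0-9]+\.[0-9.]*(?=\s|$)', '', sample)

print(by_split)  # foo v7.8.989 bar 46 baz
print(by_regex)  # foo v7.8.989 bar 1.2.3c 46 baz
```

Which behaviour is wanted depends on whether tokens like '1.2.3c' count as "words that start with numbers and contain periods".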
Upvotes: 1