Reputation: 40899
I have the following string, where the substring 2.5 was incorrectly formed: 'It costs 2. 5. That is a lot.'
How do I remove the space between the 2. and the 5?
I tried:
s = 'It costs 2. 5. That is a lot.'
s = s.replace('. ', '.')
print(s) # It costs 2.5.That is a lot.
However, that also removes the correctly placed space between the 5. and the T. I think I'm looking for a sed-style regex substitution with a backreference, like s/\. \([0-9]\)/.\1/g. How do I do that in Python?
Upvotes: 2
Views: 145
Reputation: 163277
In case the text after the number can itself start with a digit, you can also match the dot after the second digit.
If you don't want to match across newlines, you can match whitespace characters other than newlines.
\b(\d+\.)[^\S\r\n]+(\d+\.)
Explanation
\b : A word boundary
(\d+\.) : Capture group 1, match 1+ digits and a dot
[^\S\r\n]+ : Match 1+ whitespace chars without a newline
(\d+\.) : Capture group 2, match 1+ digits and a following dot
In the replacement use group 1 and group 2.
For example
import re
s = ("It costs 2. 5. That is a lot.\n"
"It costs 2. 5 items, that is a lot.")
pattern = r"\b(\d+\.)[^\S\r\n]+(\d+\.)"
print(re.sub(pattern, r"\1\2", s))
Output
It costs 2.5. That is a lot.
It costs 2. 5 items, that is a lot.
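To make the effect of the [^\S\r\n] class more concrete, here is a small sketch comparing it with plain \s. The sample text is made up for illustration and is not from the question; it splits one number with a space and another with a newline.

```python
import re

# Hypothetical sample: "2. 5." is split by a space, "3.\n7." by a newline.
text = "It costs 2. 5. Total 3.\n7. Done."

# [^\S\r\n]+ matches spaces and tabs, but never \r or \n.
same_line = re.sub(r"\b(\d+\.)[^\S\r\n]+(\d+\.)", r"\1\2", text)

# \s+ also matches the newline, so the numbers on two lines get joined too.
any_space = re.sub(r"\b(\d+\.)\s+(\d+\.)", r"\1\2", text)

print(repr(same_line))
print(repr(any_space))
```

With the negated class, only the pair on the same line is joined; with \s+ the line break between 3. and 7. is removed as well.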
Upvotes: 1
Reputation: 3553
How about this:
(?<=\d\.)\s+(?=\d)
As seen here at regex101.com
I'm making use of positive lookbehinds and lookaheads in regex, which tell the regex to match one or more whitespace characters (\s+) that are preceded by a digit and a period (given by (?<=\d\.)) and followed by a digit (given by (?=\d)).
Here's a link to learn more about lookaheads and lookbehinds. They're incredibly useful in so many problems, so I suggest you learn more about them.
import re
s = 'It costs 2. 5. That is a lot.'
s = re.sub(r"(?<=\d\.)\s+(?=\d)", "", s)
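As a rough sketch of how the lookaround version behaves, the snippet below wraps it in a helper and runs it on two inputs. The function name join_split_decimals and the second sample sentence are made up for illustration.

```python
import re

def join_split_decimals(text):
    # Hypothetical helper: delete whitespace that sits between
    # "<digit>." and another digit. The lookarounds only assert the
    # context, so the replacement is simply the empty string.
    return re.sub(r"(?<=\d\.)\s+(?=\d)", "", text)

fixed = join_split_decimals("It costs 2. 5. That is a lot.")
joined = join_split_decimals("Chapter 2. 5 items follow.")

print(fixed)   # It costs 2.5. That is a lot.
print(joined)  # Chapter 2.5 items follow.
```

Note that the second input is joined as well, because this pattern only looks at the single digit after the whitespace. Whether that is acceptable depends on the data.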
Upvotes: 2
Reputation: 59184
You can use a regex:
>>> import re
>>> s = 'It costs 2. 5. That is a lot.'
>>> re.sub(r"(\d+)\. (\d+)", r"\g<1>.\g<2>", s)
'It costs 2.5. That is a lot.'
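For comparison with the sed expression in the question, here is a minimal sketch of the two backreference spellings that re.sub accepts in the replacement string. The variable names are illustrative only.

```python
import re

s = 'It costs 2. 5. That is a lot.'

# Near-literal translation of s/\. \([0-9]\)/.\1/g:
# sed's \( \) becomes plain ( ), and \1 works unchanged in a raw string.
sed_style = re.sub(r"\. ([0-9])", r".\1", s)

# \g<1> is the explicit form; it avoids ambiguity when the reference
# is followed directly by another digit in the replacement.
explicit = re.sub(r"(\d+)\. (\d+)", r"\g<1>.\g<2>", s)

print(sed_style)  # It costs 2.5. That is a lot.
print(explicit)   # It costs 2.5. That is a lot.
```

Raw strings (the r prefix) keep the backslashes intact for the regex engine, and re.sub replaces every match by default, so no equivalent of sed's g flag is needed.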
Upvotes: 1