Reputation: 53
I'm trying to add a space after a special character in a string if there isn't one already.
This is my code:
import re
line='Hello there! This is Robert. Happy New Year!Are you surprised?Im not.'
for i in re.finditer(r"\?|!|\.", line):
    if line[i.end()]!=' ':
        line=line.replace(line[i.end()],line[i.end()]+' ')
Expected output:
"Hello there! This is Robert. Happy New Year! Are you surprised? Im not."
Output from my code:
"Hello t here! This is Robert . Happy New Year!A re you surprised? Im not ."
I still haven't figured out why it doesn't work.
Upvotes: 1
Views: 610
Reputation: 163362
In your pattern you use an alternation for three characters, \?|!|\., which could also be written as a single character class: [?!.]
What you can do is match any one of them and assert that it is followed by a non-whitespace character other than those three, in case you have, for example, Hi!!
In the replacement, you can use the full match with \g<0> followed by a space.
[?!.](?=[^?!.\s])
The pattern matches:
[?!.] Match either !, . or ?
(?= Positive lookahead, assert that what is directly to the right is
[^?!.\s] Match a non-whitespace char other than !, . or ?
) Close lookahead
See a regex demo and a Python demo.
Example
import re
regex = r"[?!.](?=[^?!.\s])"
line = 'Hi!!Hello there! This is Robert. Happy New Year!Are you surprised?Im not.'
result = re.sub(regex, r'\g<0> ', line)
if result:
    print(result)
Output
Hi!! Hello there! This is Robert. Happy New Year! Are you surprised? Im not.
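To illustrate why the lookahead also excludes the punctuation characters, here is a small comparison sketch. The sample string and the variable names `plain` and `strict` are made up for illustration; only the second pattern is the one from this answer.

```python
import re

# Hypothetical sample with runs of punctuation ("..." and "?!").
text = "Wait...what?!No."

# Lookahead that only requires a non-whitespace character next:
# every mark inside a run gets its own trailing space.
plain = re.sub(r"[?!.](?=\S)", r"\g<0> ", text)

# Lookahead that also excludes ?, ! and . (the pattern from this answer):
# only the last mark of a run is followed by a space.
strict = re.sub(r"[?!.](?=[^?!.\s])", r"\g<0> ", text)

print(plain)   # Wait. . . what? ! No.
print(strict)  # Wait... what?! No.
```

If runs such as "..." or "?!" never occur in your input, both patterns should give the same result.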
Upvotes: 1
Reputation: 110685
You can use the following regular expression with re.sub, replacing each (zero-width) match with one space:
(?<=[!?.])(?=\S)
(?<=[!?.]) is a positive lookbehind that asserts that the current string position is preceded by one of the three characters in the character class, and the positive lookahead (?=\S) asserts that the current string position is followed by a non-whitespace character.
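A minimal sketch of how this could be applied to the string from the question (the variable name `result` is just for illustration). Because the match is zero-width, the replacement is a single space and no backreference is needed:

```python
import re

line = 'Hello there! This is Robert. Happy New Year!Are you surprised?Im not.'

# Insert a space at every position that is preceded by !, ? or .
# and followed by a non-whitespace character.
result = re.sub(r'(?<=[!?.])(?=\S)', ' ', line)

print(result)
# Hello there! This is Robert. Happy New Year! Are you surprised? Im not.
```

The final "." is left alone because nothing follows it, so the lookahead (?=\S) fails at the end of the string.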
Upvotes: 2
Reputation: 18611
Use
re.sub(r'([!?.])(?=\S)', r'\1 ', line)
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
[!?.] any character of: '!', '?', '.'
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
\S non-whitespace (all but \n, \r, \t, \f,
and " ")
--------------------------------------------------------------------------------
) end of look-ahead
import re
line='Hello there! This is Robert. Happy New Year!Are you surprised?Im not.'
line = re.sub(r'([!?.])(?=\S)', r'\1 ', line)
Results: Hello there! This is Robert. Happy New Year! Are you surprised? Im not.
Upvotes: 2