Reputation: 1558
What is an efficient way to pad punctuation with whitespace?
input:
s = 'bla. bla? bla.bla! bla...'
desired output:
s = 'bla . bla ? bla . bla ! bla . . .'
Comments:
Upvotes: 17
Views: 20327
Reputation: 838306
You can use a regular expression to match the punctuation characters you are interested and surround them by spaces, then use a second step to collapse multiple spaces anywhere in the document:
s = 'bla. bla? bla.bla! bla...'
import re
s = re.sub('([.,!?()])', r' \1 ', s)
s = re.sub('\s{2,}', ' ', s)
print(s)
Result:
bla . bla ? bla . bla ! bla . . .
Upvotes: 31
Reputation: 547
If you use python3, use the maketrans() function.
import string
text = text.translate(str.maketrans({key: " {0} ".format(key) for key in string.punctuation}))
Upvotes: 8
Reputation: 138027
This will add exactly one space if one is not present, and will not ruin existing spaces or other white-space characters:
s = re.sub('(?<! )(?=[.,!?()])|(?<=[.,!?()])(?! )', r' ', s)
This works by finding a zero-width position between a punctuation and a non-space, and adding a space there.
Note that is does add a space on the beginning or end of the string, but it can be easily done by changing the look-arounds to (?<=[^ ])
and (?=[^ ])
.
See in in action: http://ideone.com/BRx7w
Upvotes: 5