Reputation: 249
So here's my problem:
I need to separate these punctuation items '], [, ?, !, (, ), ", ;, {, }' from whatever character they touch with a space. For example,
"Did he eat it (the bug)?" becomes: " Did he eat it ( the bug ) ? "
I can do something like:
re.search(r'[][?!()";{}]', mytext)
But when the search finds a match, how do I reference the item that was matched so I can replace it with itself and a space? In pseudo-code:
replace(matched_punc, matched_punc + " ")
Or the space could come before if it's word-final, but I can hash that out later. Mostly I just need to figure out how to replace something with itself and a space.
Many thanks.
Upvotes: 0
Views: 190
Reputation: 149020
What about using re.sub:
re.sub(r'([][?!()";{}])', r' \1 ', mytext)
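To make the backreference concrete, here is a small runnable sketch using the sentence from the question. The names PUNCT and spaced are only illustrative, and the whitespace-collapsing step at the end is an optional extra that assumes you don't need to preserve the original spacing.

```python
import re

# Punctuation set from the question; the brackets are escaped for readability.
PUNCT = r'([\[\]?!()";{}])'

mytext = 'Did he eat it (the bug)?'

# \1 in the replacement stands for whatever group 1 matched,
# so each punctuation mark is replaced by itself padded with spaces.
spaced = re.sub(PUNCT, r' \1 ', mytext)
print(repr(spaced))

# Optional cleanup (an assumption about what you want): collapse runs of
# whitespace and trim the ends.
collapsed = ' '.join(spaced.split())
print(repr(collapsed))
```

Because a space is added on both sides unconditionally, you get doubled spaces wherever the punctuation already touched one, which is what the second variant avoids.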
Or, if you need to ensure that you don't get multiple spaces in a row, something like this should work:
re.sub(r'(?<=\S)(?=[][?!()";{}])|(?<=[][?!()";{}])(?=\S)', ' ', mytext)
Note: Thanks to perreal for making this click for me.
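Here is a sketch of the lookaround version split into its two halves, in case the one-liner is hard to read. It assumes Python 3.6+ for the f-strings; PUNCT, GAP and separate are hypothetical names chosen for this example.

```python
import re

# Character class for the punctuation in the question (brackets escaped).
PUNCT = r'[\[\]?!()";{}]'

# Both alternatives are zero-width: they match a position, not a character.
GAP = re.compile(
    rf'(?<=\S)(?={PUNCT})'   # between a non-space character and punctuation
    rf'|(?<={PUNCT})(?=\S)'  # between punctuation and a non-space character
)

def separate(text):
    # Insert a single space at every matching position.
    return GAP.sub(' ', text)

print(separate('Did he eat it (the bug)?'))
```

Since a space is only inserted where none exists yet, running separate on its own output should leave the text unchanged.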
Upvotes: 3
Reputation: 97948
An alternative is to use lookaround expressions to do insertion instead of substitution:
print(re.sub(r'(?<=[][?!()";{}])(?=[^ ])', ' ',
             re.sub(r'(?<=[^ ])(?=[][?!()";{}])', ' ', mytext)))
Prints:
Did he eat it ( the bug ) ?
Upvotes: 2
Reputation: 72885
You would reference it with groups, like so (using your code as an example):
match = re.search(r'[][?!()";{}]', mytext)
if match:
    # group(0) is the full matched text; str.replace returns a new string
    mytext = mytext.replace(match.group(0), match.group(0) + " ")
You can find more information here.
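Note that re.search only reports the first match. If you want to handle every punctuation mark in one pass, one option is to pass a function to re.sub; it is called with each match object, and group(0) works the same way there. This is a minimal sketch, and add_trailing_space is just a name made up for the example.

```python
import re

PUNCT = re.compile(r'[\[\]?!()";{}]')

def add_trailing_space(match):
    # match.group(0) is the exact text this match covered.
    return match.group(0) + ' '

mytext = 'Did he eat it (the bug)?'

# re.sub calls add_trailing_space once per match and uses its return value.
result = PUNCT.sub(add_trailing_space, mytext)
print(repr(result))
```

The callback is also a convenient place to decide later whether the space should go before or after the matched character.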
Upvotes: 2