Reputation: 1257
I have a text:
text = "the march' which 'cause those it's good ' way"
I need to remove every apostrophe in the text that has a space before and/or after it:
"the march which cause those it's good way"
I tried:
re.sub("(?<=\b)'[a-z](?=\b)", "", text)
and
re.sub("\s'w+", " ", text)
But neither approach works for me.
Upvotes: 1
Views: 755
Reputation: 110725
Assuming you wish to remove any extra spaces when a single quote surrounded by spaces is removed, you could use the following regular expression.
(?<= ) *' +|'(?= )|(?<= )'
import re
# `text` is the input string; avoid naming it `str`, which shadows the built-in type
re.sub(r"(?<= ) *' +|'(?= )|(?<= )'", '', text)
Python's regex engine performs the following operations.
(?<= )  # the following match must be preceded by a space
 *      # match 0+ spaces
'       # match a single quote
 +      # match 1+ spaces
|       # or
'       # match a single quote
(?= )   # the single quote must be followed by a space
|       # or
(?<= )  # the following match must be preceded by a space
'       # match a single quote
(?<= ) is a positive lookbehind; (?= ) is a positive lookahead.
Note that this causes problems with "Gus' gal" and "It 'twas the night before the big bowling match", where the single quotes should not be removed.
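As a rough sketch, the substitution above can be wrapped in a small helper and run on the sample string from the question. The helper name `strip_loose_quotes` is only illustrative and is not part of the original answer. The second call shows the caveat just mentioned: the possessive quote in "Gus' gal" is removed as well.

```python
import re

# Pattern from the answer: a quote preceded by a space (also consuming the
# surrounding spaces), a quote followed by a space, or a quote preceded by a space.
PATTERN = re.compile(r"(?<= ) *' +|'(?= )|(?<= )'")

def strip_loose_quotes(text):
    """Remove single quotes that have a space before and/or after them."""
    return PATTERN.sub("", text)

text = "the march' which 'cause those it's good ' way"
print(strip_loose_quotes(text))
# the march which cause those it's good way

print(strip_loose_quotes("Gus' gal"))
# Gus gal
```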
Upvotes: 1
Reputation: 2670
Maybe...
(\s'\s?|'\s)
Given:
"the march' which 'cause those it's good ' way"
Replace with: a space, i.e., " "
Output:
"the march which cause those it's good way"
Only 131 steps.
Demo: https://regex101.com/r/x04Vg1/1
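In Python, this pattern could be applied with re.sub roughly as follows. This is a minimal sketch using the sample string from the question; the variable name `result` is only illustrative.

```python
import re

text = "the march' which 'cause those it's good ' way"

# A quote preceded by whitespace (and optionally followed by whitespace),
# or a quote followed by whitespace; each match is replaced by one space.
result = re.sub(r"(\s'\s?|'\s)", " ", text)
print(result)
# the march which cause those it's good way
```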
Upvotes: 1
Reputation: 88266
You could do this by considering the three different possibilities and chaining them with |, taking care with the order:
re.sub(r"(\s\'\s)|(\s\')|(\'\s)", ' ', text)
# "the march which cause those it's good way"
See demo
(\s\'\s)|(\s\')|(\'\s)
1st Alternative (\s\'\s)
1st Capturing Group (\s\'\s)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\' matches the character ' literally (case sensitive)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
2nd Alternative (\s\')
2nd Capturing Group (\s\')
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\' matches the character ' literally (case sensitive)
3rd Alternative (\'\s)
3rd Capturing Group (\'\s)
\' matches the character ' literally (case sensitive)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
Upvotes: 1
Reputation: 131
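You can use the string's replace() method to achieve this. Handle a quote with spaces on both sides first, then a quote followed by a space, then a quote preceded by a space:
text = "the march' which 'cause those it's good ' way"
new_text = text.replace(" ' ", " ").replace("' ", " ").replace(" '", " ")
# "the march which cause those it's good way"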
Upvotes: 2