Reputation: 2308
I want to remove a word from the sentence if the word starts with 4 or more repeating characters.
For example:
['aaaaaaa is really good', 'nott something great',
'ssssssssssssstackoverflow is a great community']
I need output like this:
['is really good', 'nott something great', 'is a great community']
I tried something like this:
^(\S)\1{3,}
It removes the repeated characters but not the rest of the word. Thanks.
Upvotes: 0
Views: 433
Reputation: 371198
Add \S*\s to the end of the pattern, so that it also consumes the rest of the word and the space after it:
import re

words = ['aaaaaaa is really good', 'nott something great', 'ssssssssssssstackoverflow is a great community']
newWords = [re.sub(r'^(\S)\1{3,}\S*\s', '', word) for word in words]
Output:
['is really good', 'nott something great', 'is a great community']
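To see what each part of the pattern does, here is a rough sketch of the same regex written with re.VERBOSE and a comment per piece. The name PATTERN is just for illustration and is not from the question.

```python
import re

# Same pattern as above, spelled out piece by piece.
# PATTERN is an illustrative name, not part of the original question.
PATTERN = re.compile(r'''
    ^          # start of the string
    (\S)       # one non-whitespace character, captured as group 1
    \1{3,}     # that same character at least 3 more times (4+ in total)
    \S*        # the rest of the word
    \s         # the single whitespace character after the word
''', re.VERBOSE)

text = 'ssssssssssssstackoverflow is a great community'
match = PATTERN.match(text)
matched_text = match.group(0)      # the whole first word plus its trailing space
repeated_char = match.group(1)     # the character that was repeated

# 'nott' only repeats 't' twice, and not at the start, so nothing matches.
no_match = PATTERN.match('nott something great')

print(repr(matched_text), repeated_char, no_match)
```

Without the \S*\s part, the match would stop at the end of the repeated run, which is why the original attempt left the rest of the word behind.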
If the string may consist of only one word, make the final space optional: use \s? instead of \s.
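A minimal sketch of the optional-space variant, assuming the same input list plus one made-up single-word entry ('zzzzzz') to show the difference. The helper name strip_leading_repeat_word is hypothetical.

```python
import re

# Leading run of 4+ identical non-space characters, the rest of that word,
# and an optional trailing whitespace character.
PATTERN = re.compile(r'^(\S)\1{3,}\S*\s?')

def strip_leading_repeat_word(sentence):
    """Drop the first word if it starts with 4 or more repeats of one character."""
    return PATTERN.sub('', sentence)

words = [
    'aaaaaaa is really good',
    'nott something great',
    'ssssssssssssstackoverflow is a great community',
    'zzzzzz',  # hypothetical single-word input, not from the question
]
new_words = [strip_leading_repeat_word(w) for w in words]
print(new_words)
# ['is really good', 'nott something great', 'is a great community', '']
```

With the mandatory \s, the single-word string would be left untouched, because there is no whitespace after the word for the pattern to match.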
Upvotes: 2