Anjith

Reputation: 2308

Remove the word if a character repeats multiple times

I want to remove a word from a sentence if the word starts with the same character repeated 4 or more times.

Example input:
['aaaaaaa is really good', 'nott something great',
       'ssssssssssssstackoverflow is a great community']

I need output like this:

['is really good', 'nott something great', 'is a great community']

I tried something like this:

^(\S)\1{3,}

It removes the repeated characters, but not the rest of the word. Thanks.

Upvotes: 0

Views: 433

Answers (1)

CertainPerformance

Reputation: 371198

Add \S*\s to the end of the pattern, so that it also consumes the rest of the word and the whitespace that follows it:

import re

words = ['aaaaaaa is really good', 'nott something great', 'ssssssssssssstackoverflow is a great community']
newWords = [re.sub(r'^(\S)\1{3,}\S*\s', '', word) for word in words]

Output:

['is really good', 'nott something great', 'is a great community']

If a string may consist of only one word, make the final whitespace optional: use \s? instead of \s.
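
To illustrate the difference, here is a minimal sketch of the \s? variant. The helper name drop_repeated_prefix_word and the sample strings are made up for this example and are not part of the question; the pattern is the one above with only the trailing \s changed.

```python
import re

# Same pattern as above, but with the trailing whitespace made optional (\s?).
# The compiled-pattern name and the helper below are hypothetical, for illustration only.
LEADING_REPEAT = re.compile(r'^(\S)\1{3,}\S*\s?')

def drop_repeated_prefix_word(text):
    """Remove the first word if it starts with one character repeated 4+ times."""
    return LEADING_REPEAT.sub('', text)

samples = ['aaaaaaa', 'aaaaaaa is really good', 'nott something great']
print([drop_repeated_prefix_word(s) for s in samples])
# ['', 'is really good', 'nott something great']
```

With the mandatory \s, the single-word string 'aaaaaaa' would be left unchanged, because there is no whitespace for the pattern to match.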

Upvotes: 2
