Reputation: 880
I am very new to Python.
I want to change a sentence if it contains repeated words.
Correct
Right now I am using this regex, but it also changes letters. Ex. "My friend and i is happy" --> "My friend and is happy" (it removes the "i" and the space) ERROR
text = re.sub(r'(\w+)\1', r'\1', text) #remove duplicated words in row
How can I make the same change, but have it check whole words instead of letters?
Upvotes: 7
Views: 10202
Reputation: 67
\b: Matches Word Boundaries
\w: Any word character
\1: Backreference to the first captured group; in the pattern it matches the same word again, and in the replacement it stands for that word
import re

def Remove_Duplicates(Test_string):
    # A word, followed by one or more repeats of itself, each after a single non-word character
    Pattern = r"\b(\w+)(?:\W\1\b)+"
    return re.sub(Pattern, r"\1", Test_string, flags=re.IGNORECASE)

Test_string1 = "Good bye bye world world"
Test_string2 = "Ram went went to to his home"
Test_string3 = "Hello hello world world"
print(Remove_Duplicates(Test_string1))
print(Remove_Duplicates(Test_string2))
print(Remove_Duplicates(Test_string3))
Result:
Good bye world
Ram went to his home
Hello world
Upvotes: 0
Reputation: 22939
text = re.sub(r'\b(\w+)( \1\b)+', r'\1', text) #remove duplicated words in row
The \b matches the empty string, but only at the beginning or end of a word.
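To see what the word boundaries buy you, here is a small sketch comparing the pattern from the question with the one above. The helper names and sample sentences are made up for illustration.

```python
import re

def dedupe_letters(text):
    # Pattern from the question: no boundaries, so it also collapses doubled letters
    return re.sub(r'(\w+)\1', r'\1', text)

def dedupe_words(text):
    # Pattern from this answer: whole words only, separated by a single space
    return re.sub(r'\b(\w+)( \1\b)+', r'\1', text)

print(dedupe_letters("hello hello happy"))  # helo helo hapy
print(dedupe_words("hello hello happy"))    # hello happy
print(dedupe_words("the theme"))            # the theme (trailing \b blocks the partial match)
```

Note that this pattern only matches repeats separated by exactly one space, and it is case-sensitive unless you pass flags=re.IGNORECASE.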
Upvotes: 9
Reputation: 250881
Non-regex solution using itertools.groupby:
>>> strs = "this is just is is"
>>> from itertools import groupby
>>> " ".join([k for k,v in groupby(strs.split())])
'this is just is'
>>> strs = "this just so so so nice"
>>> " ".join([k for k,v in groupby(strs.split())])
'this just so nice'
Upvotes: 9