Reputation: 3753
For example, I want to reduce hhhaaappy to hhaappy, since h and a each repeat more than twice.
I want to trim every run of a character that repeats more than twice down to two characters.
How can I do this quickly in Python?
Also, is there a Python module that can correct the word, for example turn hhhaaappy into happy?
Upvotes: 2
Views: 1460
Reputation: 74
I thought it would be cool to share this: a module called autocorrect.
It works by using a candidate model that applies simple edits to the word: deletion (remove a letter), transposition (swap two adjacent letters), replacement (change one letter to another), and insertion (add a letter).
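Therefore, hhhaaappy is too many edits away from happy and might not work, but inputs closer to the intended word, such as happpy or hhapppy, could work. Note that a close input can still land on a different word, as hhapy does here.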
>>> from autocorrect import spell
>>> spell('hhhaaappy')
'hhhaaappy'
>>> spell('hhapy')
'shapy'
>>> spell('happpy')
'happy'
>>> spell('hhapppy')
'happy'
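To make the "simple edit" idea concrete, here is a minimal sketch of how candidate words can be generated. It is written from the description above and is not taken from the autocorrect source; the names edits1 and edits2 are my own, and a real corrector would also filter the candidates against a dictionary and rank them by word frequency.

```python
import string

def edits1(word, alphabet=string.ascii_lowercase):
    """Return every string that is one simple edit away from `word`."""
    # Every way to split the word into a left and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def edits2(word):
    """Return every string that is two simple edits away from `word`."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1)}

print('happy' in edits1('happpy'))      # one deletion is enough
print('happy' in edits2('hhapppy'))     # needs two deletions
print('happy' in edits2('hhhaaappy'))   # needs four deletions, so out of reach
```

Under this sketch, happy is reachable from happpy and hhapppy but not from hhhaaappy, which matches the session above.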
Upvotes: 3
Reputation: 71461
You can use itertools.groupby:
import itertools
s = "hhhaaappy"
# Group consecutive identical characters, then keep at most two from each run.
final_s = ''.join(''.join(list(group)[:2]) for _, group in itertools.groupby(s))
Output:
'hhaappy'
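As an alternative, the same trimming can be done with a regular expression and a backreference. This is only a sketch; squeeze_runs and its max_run parameter are names I made up for illustration.

```python
import re

def squeeze_runs(text, max_run=2):
    """Shorten any run of one repeated character to at most `max_run` characters."""
    # (.) captures a character; \1{n,} matches n or more further copies of it,
    # so the whole pattern matches runs longer than max_run.
    pattern = r"(.)\1{%d,}" % max_run
    return re.sub(pattern, lambda m: m.group(1) * max_run, text)

print(squeeze_runs("hhhaaappy"))             # hhaappy
print(squeeze_runs("hhhhaaaaappy"))          # hhaappy
print(squeeze_runs("hhhaaappy", max_run=1))  # hapy
```

Note that collapsing runs all the way down to one character also destroys the legitimate double p, so getting from hhhaaappy to happy still needs a dictionary-based step.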
Upvotes: 6