NobleSiks

Reputation: 369

Removing duplicate instances of a character using a regex in Python

This is what I tried:

(.)(?=.*\1)

This removes every duplicate and keeps only the last instance of each character, i.e.

telnet -> lnet

I want this result:

telnet -> teln

How do I do this? I tried a lookbehind, but as far as I know that only accepts a fixed-length pattern.

I need a regex for this. I already know other ways to achieve it without a regex.

Upvotes: 3

Views: 74

Answers (2)

jwdasdk

Reputation: 183

A little 'hack' would be to reverse the string before applying the lookahead substitution, then reverse the result again:

import re

expr = r'telnetrer'[::-1]
pr = re.sub(r'(.)(?=.*\1)', r'', expr)[::-1]

print(pr)

Output

>>> telnr
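
The same idea can be wrapped in a helper and checked against the example from the question. This is only a sketch: the name `keep_first` is made up for illustration, and `re.DOTALL` is added on the assumption that the input may contain newlines, which `.` would not otherwise match.

```python
import re

def keep_first(text):
    """Keep only the first occurrence of each character (hypothetical helper)."""
    # On the reversed string, "drop every character that occurs again later"
    # keeps the last occurrence, which is the first one in the original order.
    reversed_text = text[::-1]
    deduped = re.sub(r'(.)(?=.*\1)', '', reversed_text, flags=re.DOTALL)
    return deduped[::-1]

print(keep_first('telnet'))     # teln
print(keep_first('telnetrer'))  # telnr
```

Reversing twice costs two extra string copies, but it keeps the pattern identical to the one in the question.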

Upvotes: 1

vks

Reputation: 67988

A pure regex solution is not possible. You can try a callback function, though.

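import re

z = []  # characters already kept, in order of first appearance

def fun(matchobj):
    # Exactly one group matches: group 1 if the character occurs again
    # later in the string, group 2 otherwise.
    ch = matchobj.group(1) or matchobj.group(2)
    if ch in z:
        return ''  # seen before: drop it
    z.append(ch)
    return ch      # first occurrence: keep it


x = "telnet"
print(re.sub(r"(.)(?=.*\1)|(.)", fun, x))  # teln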
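
One caveat with the code above is that `z` is a module-level list, so a second call to `re.sub` would still remember characters from the first string. A minimal sketch that keeps the seen characters local to each call follows; `remove_duplicates` and `keep_if_new` are names invented here for illustration, and the pattern is simplified to a single `.` because the callback alone decides what to keep.

```python
import re

def remove_duplicates(text):
    """Keep the first occurrence of each character (hypothetical helper)."""
    seen = set()  # fresh for every call, so calls do not affect each other

    def keep_if_new(matchobj):
        ch = matchobj.group(0)
        if ch in seen:
            return ''  # already emitted once: drop this occurrence
        seen.add(ch)
        return ch

    # re.DOTALL lets "." match newlines too (an assumption about the input).
    return re.sub(r'.', keep_if_new, text, flags=re.DOTALL)

print(remove_duplicates('telnet'))  # teln
print(remove_duplicates('telnet'))  # teln again, no state carried over
```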

Upvotes: 1
