Reputation: 6325

Removing duplicate characters from a string

How can I remove duplicate characters from a string using Python? For example, let's say I have a string:

foo = 'mppmt'

How can I make the string:

foo = 'mpt'

NOTE: Order is not important

Upvotes: 73

Answers (16)

IndPythCoder

Reputation: 753

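Using regular expressions (this collapses runs of consecutive repeated characters; non-adjacent duplicates are kept):

import re
pattern = r'(.)\1+'  # (.) captures any character; \1+ matches one or more immediate repeats of it
repl = r'\1'         # replace the whole run with a single copy
text = 'shhhhh!!!'
print(re.sub(pattern, repl, text))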

output:

sh!
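
Note that this pattern only collapses adjacent repeats, so it does not fully answer the question for a string like 'mppmt'. A minimal sketch of the difference (the helper name collapse_runs is hypothetical and not part of the original answer):

```python
import re

def collapse_runs(text):
    """Collapse each run of identical adjacent characters to one character."""
    return re.sub(r'(.)\1+', r'\1', text)

print(collapse_runs('shhhhh!!!'))  # sh!
print(collapse_runs('mppmt'))      # mpmt (the second 'm' is not adjacent, so it stays)
```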

Upvotes: 3

Cary Swoveland

Reputation: 110665

You can replace matches of

rgx = r'(.)(?=.*\1)'

with empty strings.

import re

print(re.sub(rgx, '', 'abbcabdeeeafgfh'))
  #=> "cbdeagfh"

The regular expression matches any character (.), saves it to capture group 1 ((.)) and requires (by the use of the positive lookahead (?=.*\1)) that the same character (\1) appears later in the string.

In the example, the first and second 'a's are matched, and therefore replaced with empty strings, because in each case another 'a' appears later in the string. The third 'a' is not matched because no 'a' follows it. The net effect is that the last occurrence of each character is the one that is kept.
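
For comparison, the same "keep the last occurrence" result can probably be obtained without a regex by deduplicating the reversed string. This is only a sketch: the function names are hypothetical, and it assumes Python 3.7+ so that dict preserves insertion order. The two versions can differ on multi-line input, because '.' does not match a newline by default.

```python
import re

def keep_last_regex(text):
    # Delete every character that occurs again later in the string.
    return re.sub(r'(.)(?=.*\1)', '', text)

def keep_last_dict(text):
    # Deduplicate the reversed string (keeping first occurrences there),
    # then reverse back so the last occurrences stay in their original order.
    return ''.join(dict.fromkeys(reversed(text)))[::-1]

sample = 'abbcabdeeeafgfh'
print(keep_last_regex(sample))  # cbdeagfh
print(keep_last_dict(sample))   # cbdeagfh
```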

Upvotes: 0

Soudipta Dutta

Reputation: 2122

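Keep a list for the result and a set (which does not allow duplicates) to track the characters already seen. Method 1:

def fix(string):
    seen = set()
    chars = []  # avoid naming this "list", which would shadow the built-in
    for ch in string:
        if ch not in seen:
            seen.add(ch)
            chars.append(ch)

    return ''.join(chars)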

string = "Protiijaayiiii"
print(fix(string))

Method 2:

s = "Protijayi"

aa = [ch for i, ch in enumerate(s) if ch not in s[:i]]
print(''.join(aa))

Method 3:

dd = ''.join(dict.fromkeys(s))
print(dd)

Upvotes: 6

Olivier_s_j

Reputation: 5182

Functional programming style while keeping order:

import functools

def get_unique_char(a, b):
    if b not in a:
        return a + b
    else:
        return a

if __name__ == '__main__':
    foo = 'mppmt'

    # reduce already returns a string here, so no extra join is needed
    result = functools.reduce(get_unique_char, foo)
    print(result)
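
One caveat: functools.reduce called without an initial value raises TypeError on an empty sequence, so this approach fails for foo = ''. A minimal sketch that passes '' as the initializer (the function name unique_chars is hypothetical):

```python
import functools

def unique_chars(text):
    # Start from '' so that an empty input simply returns ''.
    return functools.reduce(
        lambda acc, ch: acc if ch in acc else acc + ch,
        text,
        '',
    )

print(unique_chars('mppmt'))    # mpt
print(repr(unique_chars('')))   # ''
```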

Upvotes: 1

Tarish

Reputation: 616

d = {}
s = "YOUR_DESIRED_STRING"
res = []
for c in s:
    if c not in d:
        res.append(c)
        d[c] = 1
print("".join(res))

The variable c iterates over the string s in the for loop. For each character, the code checks whether c is already a key of the dictionary d (which starts out empty). If it is not, c is appended to the list res and d[c] is set to 1 to mark the character as seen. Once the loop has finished, res holds each distinct character once, in order of first appearance, and the joined result is printed.

Upvotes: 2

ListenSoftware Louise Ai Agent

Reputation: 4233

mylist = ["ABA", "CAA", "ADA"]
results = []
for item in mylist:
    buffer = []
    for char in item:
        if char not in buffer:
            buffer.append(char)
    results.append("".join(buffer))

print(results)

Output:

['AB', 'CA', 'AD']

Upvotes: 0

hp_elite

Reputation: 188

# Check the code and apply it in your program:

# Input: 'ppppmm'

s = 'ppppmm'
s = ''.join(set(s))
print(s)
# Output: pm (or mp; set order is arbitrary)

Upvotes: 3

Sven Marnach

Reputation: 601441

If order does not matter, you can use

"".join(set(foo))

set() will create a set of unique letters in the string, and "".join() will join the letters back to a string in arbitrary order.
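
Because the order is arbitrary, the result can differ between runs (string hashing is randomized per process by default in Python 3). If a reproducible result is wanted and any fixed order will do, one option, not part of the original answer, is to sort the set first:

```python
foo = "mppmt"

# sorted() returns the unique characters in code-point order,
# so the joined string is the same on every run.
result = "".join(sorted(set(foo)))
print(result)  # mpt
```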

If order does matter, you can use a dict instead of a set, which since Python 3.7 preserves the insertion order of the keys. (In the CPython implementation, this is already supported in Python 3.6 as an implementation detail.)

foo = "mppmt"
result = "".join(dict.fromkeys(foo))

resulting in the string "mpt". In earlier versions of Python, you can use collections.OrderedDict, which has been available starting from Python 2.7.

Upvotes: 158

swamy_teja7

Reputation: 1

from collections import OrderedDict

def remove_duplicates(value):
    # OrderedDict.fromkeys keeps the first occurrence of each character, in order
    return ''.join(OrderedDict.fromkeys(value))

print(remove_duplicates("11223445566666ababzzz@@@123#*#*"))

Upvotes: 0

hrnjan

Reputation: 383

Since a string is a sequence of characters, passing it to dict.fromkeys() builds a dictionary keyed by those characters, which removes all duplicates and (in Python 3.7+) retains their order.

"".join(dict.fromkeys(foo))

Upvotes: 1

Abhisek Meshram

Reputation: 1

def remove_duplicates(value):
    var = ""
    for i in value:
        # Append the character only if it has not been added already
        if i not in var:
            var = var + i
    return var

print(remove_duplicates("11223445566666ababzzz@@@123#*#*"))

Upvotes: 0

ravi tanwar

Reputation: 618

def dupe(str1):
    s = set(str1)
    return "".join(s)

str1 = 'geeksforgeeks'
a = dupe(str1)
print(a)

This works well if order is not important.

Upvotes: 2

Eugene Berezin

Reputation: 49

As was mentioned, "".join(set(foo)) and collections.OrderedDict will do. I added foo = foo.lower() in case the string has both upper- and lower-case characters and you need to remove ALL duplicates regardless of case.

from collections import OrderedDict
foo = "EugeneEhGhsnaWW"
foo = foo.lower()
print("".join(OrderedDict.fromkeys(foo)))

prints eugnhsaw
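
Note that lowercasing first also lowercases the output. If the original casing of each first occurrence should be kept, a sketch along these lines might work; the function name dedupe_ignore_case is hypothetical, and str.casefold() is used for the comparison only:

```python
def dedupe_ignore_case(text):
    seen = set()   # case-folded forms of the characters already kept
    kept = []
    for ch in text:
        key = ch.casefold()
        if key not in seen:
            seen.add(key)
            kept.append(ch)  # keep the character as originally written
    return "".join(kept)

print(dedupe_ignore_case("EugeneEhGhsnaWW"))  # EugnhsaW
```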

Upvotes: 3

DSM

Reputation: 352999

If order does matter, how about:

>>> foo = 'mppmt'
>>> ''.join(sorted(set(foo), key=foo.index))
'mpt'

Upvotes: 47

Kevin Coffey

Reputation: 386

If order is important,

seen = set()
result = []
for c in foo:
    if c not in seen:
        result.append(c)
        seen.add(c)
result = ''.join(result)

Or to do it without sets:

result = []
for c in foo:
    if c not in result:
        result.append(c)
result = ''.join(result)

Upvotes: 2

kev

Reputation: 161614

If order does not matter:

>>> foo='mppmt'
>>> ''.join(set(foo))
'pmt'

To keep the order:

>>> foo='mppmt'
>>> ''.join([j for i,j in enumerate(foo) if j not in foo[:i]])
'mpt'

Upvotes: 13
