Reputation: 6325
How can I remove duplicate characters from a string using Python? For example, let's say I have a string:
foo = 'mppmt'
How can I make the string:
foo = 'mpt'
NOTE: Order is not important
Upvotes: 73
Views: 226689
Reputation: 753
Using regular expressions:
import re
pattern = r'(.)\1+'  # (.) captures any character; \1+ matches one or more repeats right after it
repl = r'\1'         # replace the whole run with a single copy
text = 'shhhhh!!!'
print(re.sub(pattern, repl, text))
output:
sh!
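Note that this pattern only collapses runs of adjacent repeats, so a character that reappears later in the string is kept. A minimal sketch of that behaviour (the function name collapse_runs is made up for illustration):

```python
import re

def collapse_runs(text):
    """Collapse each run of identical adjacent characters into one character."""
    # collapse_runs is a hypothetical helper name, not part of the re module.
    return re.sub(r'(.)\1+', r'\1', text)

print(collapse_runs('shhhhh!!!'))  # sh!
print(collapse_runs('mppmt'))      # mpmt: the second 'm' is not adjacent to the first
```

So for the string in the question this gives 'mpmt' rather than 'mpt'.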
Upvotes: 3
Reputation: 110665
You can replace matches of
rgx = r'(.)(?=.*\1)'
with empty strings.
import re
print(re.sub(rgx, '', 'abbcabdeeeafgfh'))
#=> "cbdeagfh"
The regular expression matches any character (.), saves it to capture group 1 ((.)), and requires (by means of the positive lookahead (?=.*\1)) that the same character (\1) appears later in the string.
In the example, the first and second 'a's are matched, and therefore converted to empty strings, because in each case there is another 'a' later in the string. The third 'a' in the string is not matched because there are no 'a's later in the string.
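Because every occurrence except the last one is removed, the result follows the order of last occurrences, not first ones. The sketch below contrasts the two; keep_last and keep_first are illustrative names, and the re.DOTALL flag is an addition of mine so that '.' also matches newline characters.

```python
import re

def keep_last(text):
    # Remove every character that appears again later in the string.
    # re.DOTALL (an assumption, not in the answer above) lets '.' match '\n' too.
    return re.sub(r'(.)(?=.*\1)', '', text, flags=re.DOTALL)

def keep_first(text):
    # dict.fromkeys keeps the first occurrence of each character (Python 3.7+).
    return ''.join(dict.fromkeys(text))

print(keep_last('mppmt'))   # pmt
print(keep_first('mppmt'))  # mpt
```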
Upvotes: 0
Reputation: 2122
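Build a list of the characters to keep, together with a set (which does not allow duplicates) to track what has already been seen. Method 1:
def fix(string):
    seen = set()
    chars = []  # renamed from "list" to avoid shadowing the built-in
    for ch in string:
        if ch not in seen:
            seen.add(ch)
            chars.append(ch)
    return ''.join(chars)
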
string = "Protiijaayiiii"
print(fix(string))
Method 2:
s = "Protijayi"
aa = [ch for i, ch in enumerate(s) if ch not in s[:i]]
print(''.join(aa))
Method 3:
dd = ''.join(dict.fromkeys(s))
print(dd)
Upvotes: 6
Reputation: 5182
Functional programming style while keeping order:
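import functools

def get_unique_char(a, b):
    # a is the string built so far, b is the next character
    if b not in a:
        return a + b
    else:
        return a

if __name__ == '__main__':
    foo = 'mppmt'
    # reduce already returns a string; the '' initializer also handles an empty input
    result = functools.reduce(get_unique_char, foo, '')
    print(result)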
Upvotes: 1
Reputation: 616
d = {}
s = "YOUR_DESIRED_STRING"
res = []
for c in s:
    if c not in d:
        res.append(c)
        d[c] = 1
print("".join(res))
The variable c traverses the string s in the for loop. For each character, the code checks whether c is already a key in the dict d (which starts out empty). If it is not, c is appended to the list res and d[c] is set to 1 to mark it as seen. Once the loop has traversed the whole string, res holds each unique character once, in order of first appearance, and is joined and printed.
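The loop described above can be wrapped in a reusable function. This is only a sketch: the name unique_chars is hypothetical, and it swaps in an actual set for the membership check, since only membership is needed and no values are stored.

```python
def unique_chars(s):
    """Return s with repeated characters dropped, keeping first occurrences."""
    seen = set()   # characters encountered so far
    res = []       # unique characters in order of first appearance
    for c in s:
        if c not in seen:
            seen.add(c)
            res.append(c)
    return "".join(res)

print(unique_chars("mppmt"))  # mpt
```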
Upvotes: 2
Reputation: 4233
mylist = ["ABA", "CAA", "ADA"]
results = []
for item in mylist:
    buffer = []
    for char in item:
        if char not in buffer:
            buffer.append(char)
    results.append("".join(buffer))
print(results)
Output:
['AB', 'CA', 'AD']
Upvotes: 0
Reputation: 188
# Check the code and apply it in your program.
# Input: 'ppppmm'
s = 'ppppmm'
s = ''.join(set(s))
print(s)
# Output: pm (or mp, since set order is arbitrary)
Upvotes: 3
Reputation: 601441
If order does not matter, you can use
"".join(set(foo))
set() will create a set of unique letters in the string, and "".join() will join the letters back to a string in arbitrary order.
If order does matter, you can use a dict instead of a set, which since Python 3.7 preserves the insertion order of the keys. (In the CPython implementation, this is already supported in Python 3.6 as an implementation detail.)
foo = "mppmt"
result = "".join(dict.fromkeys(foo))
resulting in the string "mpt". In earlier versions of Python, you can use collections.OrderedDict, which has been available starting from Python 2.7.
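As a rough illustration of the difference between the two approaches: sorting the set gives a reproducible (alphabetical) result when order does not matter, and OrderedDict gives first-occurrence order on older interpreters. The sample string "banana" and the variable names are my own, not from the question.

```python
from collections import OrderedDict

word = "banana"  # illustrative input

# A plain set has arbitrary order; sorting it makes the output reproducible.
alphabetical = "".join(sorted(set(word)))

# OrderedDict.fromkeys keeps the first occurrence of each character, in order.
first_seen = "".join(OrderedDict.fromkeys(word))

print(alphabetical)  # abn
print(first_seen)    # ban
```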
Upvotes: 158
Reputation: 1
from collections import OrderedDict

def remove_duplicates(value):
    # OrderedDict.fromkeys keeps the first occurrence of each character, in order
    return ''.join(OrderedDict.fromkeys(value))

print(remove_duplicates("11223445566666ababzzz@@@123#*#*"))
Upvotes: 0
Reputation: 383
Since a string is a sequence of characters, passing it to dict.fromkeys() removes all duplicates and retains the order of first appearance (dicts preserve insertion order since Python 3.7).
"".join(list(dict.fromkeys(foo)))
Upvotes: 1
Reputation: 1
def remove_duplicates(value):
    var = ""
    for i in value:
        # keep a character only the first time it is seen
        if i not in var:
            var = var + i
    return var

print(remove_duplicates("11223445566666ababzzz@@@123#*#*"))
Upvotes: 0
Reputation: 618
def dupe(str1):
    s = set(str1)
    return "".join(s)

str1 = 'geeksforgeeks'
a = dupe(str1)
print(a)
This works well if order is not important.
Upvotes: 2
Reputation: 49
As was mentioned, "".join(set(foo)) and collections.OrderedDict will do. I added foo = foo.lower() in case the string has both upper- and lowercase characters and you need to remove all duplicates regardless of case.
from collections import OrderedDict
foo = "EugeneEhGhsnaWW"
foo = foo.lower()
print("".join(OrderedDict.fromkeys(foo)))
prints eugnhsaw
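Lowercasing the whole string also changes the case of the characters that are kept. If the original case of the first occurrence should survive, one possible variant is sketched below; unique_ignore_case is a hypothetical name, and str.casefold() is used here as the comparison key instead of lower().

```python
def unique_ignore_case(text):
    """Drop duplicates case-insensitively, keeping each first occurrence as written."""
    seen = set()   # case-folded characters encountered so far
    kept = []
    for ch in text:
        key = ch.casefold()
        if key not in seen:
            seen.add(key)
            kept.append(ch)
    return "".join(kept)

print(unique_ignore_case("EugeneEhGhsnaWW"))  # EugnhsaW
```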
Upvotes: 3
Reputation: 352999
If order does matter, how about:
>>> foo = 'mppmt'
>>> ''.join(sorted(set(foo), key=foo.index))
'mpt'
Upvotes: 47
Reputation: 386
If order is important,
seen = set()
result = []
for c in foo:
    if c not in seen:
        result.append(c)
        seen.add(c)
result = ''.join(result)
Or to do it without sets:
result = []
for c in foo:
    if c not in result:
        result.append(c)
result = ''.join(result)
Upvotes: 2
Reputation: 161614
If order does not matter:
>>> foo='mppmt'
>>> ''.join(set(foo))
'pmt'
To keep the order:
>>> foo='mppmt'
>>> ''.join([j for i,j in enumerate(foo) if j not in foo[:i]])
'mpt'
Upvotes: 13