Animeartist

Reputation: 1187

How to perform in-place removal of duplicates from a string in Python?

I am trying to implement an in-place algorithm to remove duplicates from a string in Python.

str1 = "geeksforgeeks"
for i in range(len(str1)):
    for j in range(i+1,len(str1)-1):
        if str1[i] == str1[j]:  # Error line
            str1 = str1[0:j]+""+str1[j+1:]

print(str1)

In the above code, I am trying to replace each duplicate character with an empty string. But I get IndexError: string index out of range at if str1[i] == str1[j]. Am I missing something, or is this not the right way to do it?

My expected output is: geksfor

Upvotes: 1

Views: 115

Answers (4)

Olivier Melançon

Reputation: 22324

Here is a simplified version of unique_everseen from itertools recipes.

from itertools import filterfalse

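def unique_everseen(iterable):
    seen = set()
    seen_add = seen.add  # bind the method once to avoid repeated lookups
    # filterfalse yields only the elements not already in `seen`
    for element in filterfalse(seen.__contains__, iterable):
        seen_add(element)
        yield element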

You can then use this generator with str.join to get the expected output.

str1 = "geeksforgeeks"
new_str1 = ''.join(unique_everseen(str1)) # 'geksfor'

Upvotes: 0

Daweo

Reputation: 36640

As already pointed out, str is immutable, so the in-place requirement makes no sense. If you want to get the desired output, I would do it the following way:

str1 = 'geeksforgeeks'
out = ''.join([i for inx,i in enumerate(str1) if str1.index(i)==inx])
print(out) #prints: geksfor

Here I used the enumerate function to get the letters together with their positions (inx), and the fact that the .index method of str returns the lowest index at which an element occurs. Therefore str1.index('e') for the given string is 1, not 2, 9, or 10.
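To make the explanation above concrete, here is a small sketch that prints the intermediate values. The names first_e and kept are only illustrative and are not part of the original answer.

```python
str1 = "geeksforgeeks"

# str.index returns the position of the first occurrence only.
first_e = str1.index("e")

# Pair every character with its position and keep the pair only when
# that position is where the character first appears in the string.
kept = [(inx, ch) for inx, ch in enumerate(str1) if str1.index(ch) == inx]
out = "".join(ch for _, ch in kept)

print(first_e)
print(kept)
print(out)
```

Note that each str.index call scans from the start of the string, so on long inputs this approach may do more work than a set-based one.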

Upvotes: 0

Jab

Reputation: 27515

You can do all of this with just a set and a comprehension. No need to complicate things.

str1 = "geeksforgeeks"

seen = set()
seen_add = seen.add
print(''.join(s for s in str1 if not (s in seen or seen_add(s))))
#geksfor

"Simple is better than complex."

~ See PEP20

Edit

While the above is simpler than your answer and is a performant way of removing duplicates from a collection, an even simpler solution would be to use:

from collections import OrderedDict
print("".join(OrderedDict.fromkeys(str1)))
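As a side note, plain dicts preserve insertion order as a language guarantee since Python 3.7, so on those versions dict.fromkeys should behave the same way here. A minimal sketch comparing the two, assuming Python 3.7 or later (the names ordered and plain are illustrative):

```python
from collections import OrderedDict

str1 = "geeksforgeeks"

# fromkeys keeps one key per distinct character, in first-seen order.
ordered = "".join(OrderedDict.fromkeys(str1))

# A plain dict keeps insertion order too on Python 3.7+.
plain = "".join(dict.fromkeys(str1))

print(ordered, plain)
```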

Upvotes: 1

Adam Smith

Reputation: 54223

It is impossible to modify strings in-place in Python, the same way that it's impossible to modify numbers in-place in Python.

a = "something"
b = 3

b += 1        # allocates a new integer, 4, and assigns it to b
a += " else"  # allocates a new string, " else", concatenates it to `a` to produce "something else"
              # then assigns it to a
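If in-place behaviour is really what you are after, one option is to work on a list of characters, which is mutable, and join it back into a string at the end. The sketch below is only an illustration under that assumption; dedupe_in_place is a hypothetical helper name, not a built-in or anything from the answers above.

```python
a = "something"
error = None
try:
    a[0] = "S"  # strings do not support item assignment
except TypeError as exc:
    error = str(exc)


def dedupe_in_place(chars):
    """Remove repeated items from a list, keeping first occurrences."""
    seen = set()
    write = 0  # next slot to fill; never ahead of the read position
    for ch in chars:
        if ch not in seen:
            seen.add(ch)
            chars[write] = ch
            write += 1
    del chars[write:]  # drop the leftover tail


chars = list("geeksforgeeks")
dedupe_in_place(chars)
result = "".join(chars)  # building the final string still allocates a new str
print(result)
```

The list is modified without creating a second list, but the conversion to and from str still creates new objects, which is unavoidable given that str is immutable.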

Upvotes: 0
