Reputation: 71

How to remove characters that appear more than once from a string?

I had an exercise like this in my IT class: 'Print a string without the characters that appear more than once (if a character appears more than once, remove all of its occurrences)'. I thought it was easy (and maybe it is), but I have no idea how to do it. I can do similar exercises (print all unique characters from a string, remove duplicates, etc.).

Example:

Input: '12345555555678'

Output: '1234678'

Upvotes: 0

Answers (5)

Aldas Žarnauskas

Reputation: 1

I solved a similar task on Codecademy. I was asked to define a function that removes all vowels, including repeated ones. My code, which removes every occurrence of the listed symbols, is below:

def anti_vowel(text):
    all_vowels = ["A", "E", "U", "I", "O", "a", "e", "o", "u", "i"]
    listed_text = []
    for letter in text:
        listed_text.append(letter)
    for vowel in all_vowels:
        while vowel in listed_text:
            listed_text.remove(vowel)
    return "".join(listed_text)
    
print(anti_vowel("Hey look Words!"))

output:

Hy lk Wrds!

Upvotes: 0

Akhil Sharma

Reputation: 175

i_str =  '12345555555678'
b = sorted(i_str)
for i in range(len(b)-1):
    if b[i] == b[i+1]:
        i_str = i_str.replace(b[i],'')

You just sort the string and compare each element with the next one. If they are not the same, the character is unique; if they are the same, it is removed from the original string.

Also, I am pretty sure it should be faster than using the count function, which iterates through the whole string for each element to check whether its count is greater than 1.
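The sort-and-compare idea can also be sketched as a reusable function. This is only an illustration, not the code above: the name `remove_repeated_sorted` is made up here, and `itertools.groupby` is swapped in for the manual neighbour comparison. It collects the repeated characters first and then filters the original string once, so the input order is kept.

```python
from itertools import groupby

def remove_repeated_sorted(text):
    """Drop every character that occurs more than once in text."""
    # After sorting, equal characters are adjacent, so groupby yields
    # exactly one run per distinct character.
    repeated = {ch for ch, run in groupby(sorted(text)) if len(list(run)) > 1}
    # Filter the original (unsorted) string so the input order is preserved.
    return ''.join(ch for ch in text if ch not in repeated)

print(remove_repeated_sorted('12345555555678'))  # 1234678
```

Sorting costs O(n log n), so this is still better than calling count for every position, though not linear.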

Upvotes: 1

Sorin

Reputation: 5395

The basic algorithm for this is described in this answer: for each char, you check whether it appears more than once by counting its occurrences in the string.

However, that's fairly inefficient, since it goes through the string once per char, for O(n^2) work in total. You can improve on that at the expense of some memory (which is illustrated in this answer, but obscured by a library).

The algorithm would then be to go through the string once, count the number of occurrences of each char and save the counts somewhere, then go through the string again and print only the chars whose count is 1.

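inp = '1345552225555678'

# First pass: count how often each character occurs.
counts = {}

for ch in inp:
    if ch in counts:
        counts[ch] = counts[ch] + 1
    else:
        counts[ch] = 1

# Second pass: keep only the characters that occur exactly once.
result = ''

for ch in inp:
    if counts[ch] == 1:
        result = result + ch

print(result)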

Arguably, this would be O(n), since the access time for a dictionary is generally considered O(1) (see this question for a discussion).

Note: Usually this is done using an array the size of the number of legal chars, but since strings in Python are Unicode, such an array would be huge; the access time, however, would be truly O(1).
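For illustration, the array variant from the note might look like the sketch below. It assumes the input is plain ASCII (code points below 128), which is a restriction introduced here and not part of the question; the function name `remove_repeated_ascii` is hypothetical.

```python
def remove_repeated_ascii(text):
    # Assumption: every character is plain ASCII (code point < 128).
    # One counter slot per possible character, indexed by code point.
    counts = [0] * 128
    for ch in text:
        counts[ord(ch)] += 1  # raises IndexError for non-ASCII input
    # Second pass: keep only the characters counted exactly once.
    return ''.join(ch for ch in text if counts[ord(ch)] == 1)

print(remove_repeated_ascii('1345552225555678'))  # 134678
```

The list lookup avoids hashing entirely, at the cost of rejecting anything outside the assumed character range.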

Upvotes: 2

txemsukr

Reputation: 1037

This should look like what you want:

input_str = 'ahuadvzudnioqdazvyduazdazdui'
for c in input_str:
    if input_str.count(c)==1:
        print(c)

It's easy to understand, but note that it has quite low performance (complexity of O(n^2)).

To make it a little faster, you can use a list comprehension.

input_str = '12345555555678'
[x for x in input_str if input_str.count(x) == 1]

If the order of the elements doesn't matter to you, iterating over a set of the characters will be beneficial.

If you convert the string into a set using set(input_str), it will contain only unique values, which may reduce the search space.

Then you can apply the list comprehension.

input_str = '12345555555678'
[x for x in set(input_str) if input_str.count(x) == 1]

Note: Do not forget that the order will not be preserved after converting to a set.
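If the order does matter, one possible workaround is to use the set only to decide which characters occur once, and then filter the original string. This is a sketch under that idea; the helper name `remove_repeated_keep_order` is made up for the example.

```python
def remove_repeated_keep_order(input_str):
    # count() is called once per distinct character, not once per position.
    singles = {x for x in set(input_str) if input_str.count(x) == 1}
    # Walk the original string so the characters come out in input order.
    return ''.join(x for x in input_str if x in singles)

print(remove_repeated_keep_order('12345555555678'))  # 1234678
```

This still scans the string once for each distinct character, so it helps most when the string is long but uses few different characters.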

Upvotes: 1

Filip Młynarski

Reputation: 3612

You could use collections.Counter().

from collections import Counter

inp = '12345555555678'
c = Counter(inp)
output = ''.join(k for k, v in c.items() if v == 1)  # -> 1234678

A simple implementation of what Counter does here, using a plain dict:

c = {}
for char in inp:
    c[char] = c.get(char, 0) + 1
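Joining over c.items() relies on the counter keeping keys in first-seen order, which dicts guarantee from Python 3.7 on. A sketch that does not depend on that is to iterate over the input itself and look each character up in the counter. The function name `remove_repeated` is hypothetical.

```python
from collections import Counter

def remove_repeated(inp):
    # Count every character in a single pass.
    counts = Counter(inp)
    # Iterate over the input, not the counter, so order follows the string.
    return ''.join(ch for ch in inp if counts[ch] == 1)

print(remove_repeated('12345555555678'))  # 1234678
```

Both passes are linear in the length of the string.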

Upvotes: 1
