Reputation: 2358
I have written a script to remove all unique elements from a list and print the list with only the repeated elements. Below are some examples of how the output list should look for a given input list:
Input list1:
1,2,1,1,3,5,3,4,3,1,6,7,8,5
Output List1:
1,1,1,3,5,3,3,1,5
Input list2:
1,2,1,1,3,3,4,3,1,6,5
Output List2:
1,1,1,3,3,3,1
#! /bin/python
def remove_unique(*n):
    dict1 = {}
    list1 = []
    # record the index of every element that appears more than once
    for i in range(len(n)):
        for j in range(i + 1, len(n)):
            if n[i] == n[j]:
                dict1[j] = n[j]
                dict1[i] = n[i]
    # rebuild the list in the original order, keeping only recorded indices
    for x in range(len(n)):
        if x in dict1.keys():
            list1.append(dict1[x])
    return list1

lst1 = remove_unique(1,2,1,1,3,5,3,4,3,1,6,7,8,5)
for n in lst1:
    print(n, end=" ")
The script above works exactly as expected when tested with a few smaller lists. However, I want some ideas on how to optimize the script (considering both time and space complexity) for input lists of larger lengths (50000 <= len(list) <= 50M).
Upvotes: 1
Views: 125
Reputation: 140168
Your script has a number of issues:

- if x in dict1.keys() should be if x in dict1, to be sure you use the dictionary's hash-based membership test instead of a linear scan (see the sketch after this list)
- append in a loop, not as performant
- O(n^2) complexity because of the double loop
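A minimal sketch of the first two fixes applied to your function (the double loop, and therefore the O(n^2) cost, is left untouched):

def remove_unique(*n):
    dict1 = {}
    for i in range(len(n)):
        for j in range(i + 1, len(n)):
            if n[i] == n[j]:
                dict1[j] = n[j]
                dict1[i] = n[i]
    # hash-based membership test, and a list comprehension instead of append in a loop
    return [dict1[x] for x in range(len(n)) if x in dict1]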
My approach:

You could count your elements using collections.Counter, then build a new list with a list comprehension, keeping only the elements whose number of occurrences is greater than 1:
from collections import Counter
list1 = [1,2,1,1,3,5,3,4,3,1,6,7,8,5]
c = Counter(list1)
new_list1 = [k for k in list1 if c[k]>1]
print(new_list1)
result:
[1, 1, 1, 3, 5, 3, 3, 1, 5]
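If memory also matters for the 50M-element case, you can avoid materializing the filtered list by yielding lazily. A minimal sketch (iter_repeated is just a name I picked for illustration):

from collections import Counter

def iter_repeated(values):
    # one full pass to count, then a lazy second pass that yields
    # repeated elements without building a second big list in memory
    counts = Counter(values)
    for v in values:
        if counts[v] > 1:
            yield v

# usage: consume the result as a stream instead of building a huge list
for v in iter_repeated([1, 2, 1, 1, 3, 5, 3, 4, 3, 1, 6, 7, 8, 5]):
    print(v, end=" ")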
The complexity of this approach is roughly O(n): one linear scan of the list to build the counter (hashing each key), plus another linear scan with average-O(1) dictionary lookups in the list comprehension. So, it's good performance-wise.
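If you want to check this on inputs closer to your sizes, here is a rough timing sketch with timeit (the sizes and values are made up; actual numbers depend on your machine):

import random
import timeit
from collections import Counter

def keep_repeated(values):
    counts = Counter(values)
    return [v for v in values if counts[v] > 1]

# 50,000 random ints; scale the size toward 50M to match your use case
data = [random.randint(0, 10**6) for _ in range(50_000)]
print(timeit.timeit(lambda: keep_repeated(data), number=10))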
Upvotes: 3