Reputation: 133
I am trying to solve the following problem:
I have a fairly long list of integers in a given range, and most of them contain repeated digits, as in the example below.
[123456, 889756, 854123, 997886, 634178]
My goal is to remove the ones with repeated digits, i.e. to get a new list containing only the numbers whose digits are all distinct:
[123456, 854123, 634178]
Is there a nice way to do this? Thank you very much in advance!
Upvotes: 2
Views: 1571
Reputation: 23955
When using PyPy, this seems to be about 2 to 7 times faster than the other answers:
def has_duplicates(n):
    s = 0
    while n:
        d = n % 10
        if (1<<d) & s:
            return True
        s |= 1<<d
        n = n // 10
    return False

def f(lst):
    return [i for i in lst if not has_duplicates(i)]
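For readers unfamiliar with the bit trick above, here is a minimal sketch of the same idea with the mask spelled out. The name `has_duplicate_digits` and the `abs()` call are my own assumptions, not part of the answer. The reason for `abs()`: with a negative input, Python's floor division and modulo no longer yield the decimal digits, and the loop eventually sticks at -1 and sees the digit 9 twice, so the function above would report every negative number as having duplicates.

```python
def has_duplicate_digits(n):
    """Return True if any decimal digit occurs more than once in n."""
    n = abs(n)  # assumption: the sign is ignored
    seen = 0    # bit d of `seen` is set once digit d has been encountered
    while n:
        n, d = divmod(n, 10)  # strip the last digit
        bit = 1 << d
        if seen & bit:        # digit d was already seen
            return True
        seen |= bit
    return False

lst = [123456, 889756, 854123, 997886, 634178, -123, -1213]
result = [i for i in lst if not has_duplicate_digits(i)]
print(result)
```

Since there are only ten possible digits, a single integer is enough to record which ones have appeared, and the loop can stop at the first repeat.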
Upvotes: 1
Reputation: 350147
You can use set on the string representation of the number to see how many digits make it into the set. If that is the same number of digits as the original number has, it passes the test:
lst = [123456, 889756, 854123, 997886, 634178]
result = [n for n in lst if len(set(str(n))) == len(str(n))]
print(result)
As commented below, benchmarks confirm that it is advantageous to assign the string to a temporary variable inline (this uses the := operator, available from Python 3.8):
result = [n for n in lst if len(set(s := str(n))) == len(s)]
Upvotes: 5
Reputation: 195418
Another solution, with re:
import re
r = re.compile(r"(\d).*\1")
lst = [123456, 889756, 854123, 997886, 634178]
lst = [i for i in lst if not r.search(str(i))]
print(lst)
Prints:
[123456, 854123, 634178]
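The pattern is not explained in the answer, so here is a small sketch of what it matches: `(\d)` captures one digit, `.*` skips over any characters, and the backreference `\1` requires the captured digit to appear again. The helper name `first_repeated_digit` is hypothetical and only for illustration.

```python
import re

# (\d) captures a digit, .* skips anything, \1 demands that same digit again
repeated_digit = re.compile(r"(\d).*\1")

def first_repeated_digit(n):
    """Return the digit captured by the first match, or None if no digit repeats."""
    m = repeated_digit.search(str(n))
    return m.group(1) if m else None

for n in [123456, 889756, 997886]:
    print(n, first_repeated_digit(n))
```

A number is kept by the filter exactly when `search` finds no match, i.e. when this helper returns None.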
EDIT: Small benchmark:
import re
from timeit import timeit

lst = [123456, 889756, 854123, 997886, 634178] * 10000

def re_method(lst):
    r = re.compile(r"(\d).*\1")
    return [i for i in lst if not r.search(str(i))]

def trincot1(lst):
    return [n for n in lst if len(set(str(n))) == len(str(n))]

def trincot2(lst):
    return [n for n in lst if len(set(s := str(n))) == len(s)]

def afaalgo(lst):
    answer = []
    for value in lst:
        value = str(value)
        new_list = []
        for nums in value:
            new_list.append(nums)
        if sorted(list(set(new_list))) == sorted(new_list):
            answer.append(int(value))
    return answer

t1 = timeit(lambda: re_method(lst), number=10)
t2 = timeit(lambda: trincot1(lst), number=10)
t3 = timeit(lambda: trincot2(lst), number=10)
t4 = timeit(lambda: afaalgo(lst), number=10)

print(t1)
print(t2)
print(t3)
print(t4)
Prints on my machine (3700x/Python 3.8.5):
0.2806989410019014
0.33745980000821874
0.263871792005375
0.8039937680005096
So the version with set() and := is the fastest in this case.
Upvotes: 2
Reputation: 688
lst = [123456, 889756, 854123, 997886, 634178]
answer = []
for value in lst:
    value = str(value)
    new_list = []
    for nums in value:
        new_list.append(nums)
    if sorted(list(set(new_list))) == sorted(new_list):
        answer.append(int(value))
print(answer)
Upvotes: 1
Reputation: 15685
Here is some pseudocode (not in Python):
newList <- []
for each number in oldList
    if hasNoRepeatDigits(number)
        newList.add(number)
    endif
endfor
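The pseudocode above leaves hasNoRepeatDigits undefined. One possible Python rendering is sketched below. The function names `has_no_repeat_digits` and `keep_distinct` are hypothetical, and the digit check (comparing the size of the digit set with the digit count, ignoring the sign) is just one assumed way to fill that gap.

```python
def has_no_repeat_digits(number):
    """Assumed helper: True if every decimal digit of number is distinct."""
    digits = str(abs(number))  # assumption: the sign is ignored
    return len(set(digits)) == len(digits)

def keep_distinct(old_list):
    """Direct translation of the pseudocode loop."""
    new_list = []
    for number in old_list:
        if has_no_repeat_digits(number):
            new_list.append(number)
    return new_list

print(keep_distinct([123456, 889756, 854123, 997886, 634178]))
```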
Upvotes: 1