Reputation: 1135
I was seeking for a way to test whether a tuple has all distinct elements - so to say, it is a set and ended up with this quick and dirty solution.
def distinct ( tup):
n=0
for t in tup:
for k in tup:
#print t,k,n
if (t == k ):
n = n+1
if ( n != len(tup)):
return False
else :
return True
print distinct((1,3,2,10))
print distinct((3,3,4,2,7))
Any thinking error? Is there a builtin on tuples?
Upvotes: 4
Views: 1908
Reputation: 5599
a = (1,2,3,4,5,6,7,8,9)
b = (1,2,3,4,5,6,6,7,8,9)
def unique(tup):
i = 0
while i < len(tup):
if tup[i] in tup[i+1:]:
return False
i = i + 1
return True
unique(a)
True
unique(b)
False
Readable and works with hashable items. People seem adverse to "while". This also allows for "special cases" and filtering on attributes.
Upvotes: 0
Reputation: 64378
In the majority of the cases, where all of the items in the tuple are hashable and support comparison (using ==
operator) with each other, @sshashank124's solution is what you're after:
len(set(tup))==len(tup)
For the example you posted, i.e. a tuple of ints, that would do.
Else, if the items are not hashable, but do have order defined on them (support '==', '<' operators, etc.), the best you can do is sorting them (O(NlogN) worst case), and then look for adjacent equal elements:
sorted_tup = sorted(tup)
all( x!=y for x,y in zip(sorted_tup[:-1],sorted_tup[1:]) )
Else, if the items only support equality comparison (==
), the best you can do is the O(N^2) worst case algorithm, i.e. comparing every item to every other.
I would implement it this way, using itertools.combinations:
def distinct(tup):
for x,y in itertools.combinations(tup, 2):
if x == y:
return False
return True
Or as a one-liner:
all( x!=y for x,y in itertools.combinations(tup, 2) )
Upvotes: 2
Reputation: 16950
What about using early_exit:
This will be faster:
def distinct(tup):
s = set()
for x in tup:
if x in s:
return False
s.add(x)
return True
Ok: Here is the test and profiling with 1000 int:
#!/usr/bin/python
def distinct1(tup):
n=0
for t in tup:
for k in tup:
if (t == k ):
n = n+1
if ( n != len(tup)):
return False
else :
return True
def distinct2(tup):
return len(set(tup))==len(tup)
def distinct3(tup):
s = set()
for x in tup:
if x in s:
return False
s.add(x)
return True
import cProfile
from faker import Faker
from timeit import Timer
s = Faker()
func = [ distinct1, distinct2, distinct3 ]
tuples = tuple([ s.random_int(min=1, max=9999) for x in range(1000) ])
for fun in func:
t = Timer(lambda: fun(tuples))
print fun.__name__, cProfile.run('t.timeit(number=1000)')
Output: distinct 2 and distinct3 are almost the same.
distinct1 3011 function calls in 60.289 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 60.289 60.289 <string>:1(<module>)
1000 60.287 0.060 60.287 0.060 compare_tuple.py:3(distinct1)
1000 0.001 0.000 60.288 0.060 compare_tuple.py:34(<lambda>)
1 0.000 0.000 0.000 0.000 timeit.py:143(setup)
1 0.000 0.000 60.289 60.289 timeit.py:178(timeit)
1 0.001 0.001 60.289 60.289 timeit.py:96(inner)
1 0.000 0.000 0.000 0.000 {gc.disable}
1 0.000 0.000 0.000 0.000 {gc.enable}
1 0.000 0.000 0.000 0.000 {gc.isenabled}
1 0.000 0.000 0.000 0.000 {globals}
1000 0.000 0.000 0.000 0.000 {len}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2 0.000 0.000 0.000 0.000 {time.time}
None
distinct2 4011 function calls in 0.053 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.053 0.053 <string>:1(<module>)
1000 0.052 0.000 0.052 0.000 compare_tuple.py:14(distinct2)
1000 0.000 0.000 0.053 0.000 compare_tuple.py:34(<lambda>)
1 0.000 0.000 0.000 0.000 timeit.py:143(setup)
1 0.000 0.000 0.053 0.053 timeit.py:178(timeit)
1 0.000 0.000 0.053 0.053 timeit.py:96(inner)
1 0.000 0.000 0.000 0.000 {gc.disable}
1 0.000 0.000 0.000 0.000 {gc.enable}
1 0.000 0.000 0.000 0.000 {gc.isenabled}
1 0.000 0.000 0.000 0.000 {globals}
2000 0.000 0.000 0.000 0.000 {len}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2 0.000 0.000 0.000 0.000 {time.time}
None
distinct3 183011 function calls in 0.072 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.072 0.072 <string>:1(<module>)
1000 0.051 0.000 0.070 0.000 compare_tuple.py:17(distinct3)
1000 0.002 0.000 0.072 0.000 compare_tuple.py:34(<lambda>)
1 0.000 0.000 0.000 0.000 timeit.py:143(setup)
1 0.000 0.000 0.072 0.072 timeit.py:178(timeit)
1 0.000 0.000 0.072 0.072 timeit.py:96(inner)
1 0.000 0.000 0.000 0.000 {gc.disable}
1 0.000 0.000 0.000 0.000 {gc.enable}
1 0.000 0.000 0.000 0.000 {gc.isenabled}
1 0.000 0.000 0.000 0.000 {globals}
181000 0.019 0.000 0.019 0.000 {method 'add' of 'set' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
2 0.000 0.000 0.000 0.000 {time.time}
None
Upvotes: 1
Reputation: 32197
You can very easily do it as:
len(set(tup))==len(tup)
This creates a set
of tup
and checks if it is the same length as the original tup
. The only case in which they would have the same length is if all elements in tup
were unique
>>> a = (1,2,3)
>>> print len(set(a))==len(a)
True
>>> b = (1,2,2)
>>> print len(set(b))==len(b)
False
>>> c = (1,2,3,4,5,6,7,8,5)
>>> print len(set(c))==len(c)
False
Upvotes: 11