Reputation: 3
I found some solutions in this forum, but they do not work quite the way I hoped. The following input data is used:
ALY1 ATH BOL BRA
ALY2 ATH BOL BRA
ALY3 ATH BOL BRA
ALY4 ATH BOL BRA
ALY5 BOL BOL BRA
ALY6 ATH BOL BRA BOL
I want to keep only lines 1, 2, 3, and 4, and drop lines 5 and 6 because they contain duplicates. This is what I used:
f_groups = open(args[1], "r")
f_idl_group = open(args[2], "w")

def allUnique(x):
    seen = set()
    return not any(i in seen or seen.add(i) for i in x)

for line in f_groups:
    line_elements = line.split()
    identifyers = line_elements[0:]
    if allUnique(identifyers):
        print("all is well" + identifyers[0])
        # write to file
With the script as above, all lines pass. But with:
    if not allUnique(identifyers):
only lines 5 and 6 pass. The latter is what I would expect, but I want the opposite: to pass only lines 1, 2, 3, and 4, and that fails. Any help is appreciated. Thanks.
Upvotes: 0
Views: 59
Reputation: 24052
Try this:
def allUnique(x):
    return len(x) == len(set(x))
This will return True if all elements in list x are unique, otherwise False. set(x) is a set of the elements of x, with any duplicates removed. If it has the same element count as x, then there were no duplicates. Otherwise there were.
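For illustration, here is a minimal sketch of the length comparison applied to the sample data from the question. The names all_unique and keep_unique_lines are hypothetical helpers introduced for this example, and the input is an inline string rather than the files opened in the question, so adapt it to your own file handling.

```python
def all_unique(items):
    """Return True if no element of items occurs more than once."""
    # A set drops duplicates, so the lengths differ only when duplicates exist.
    return len(items) == len(set(items))


def keep_unique_lines(lines):
    """Return the lines whose whitespace-separated fields are all distinct.

    Hypothetical helper for this example; blank lines are skipped.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if fields and all_unique(fields):
            kept.append(line.rstrip("\n"))
    return kept


# Sample input copied from the question.
sample = """\
ALY1 ATH BOL BRA
ALY2 ATH BOL BRA
ALY3 ATH BOL BRA
ALY4 ATH BOL BRA
ALY5 BOL BOL BRA
ALY6 ATH BOL BRA BOL
"""

kept = keep_unique_lines(sample.splitlines())
for line in kept:
    print(line)
```

With this input, only the ALY1 to ALY4 lines should be printed, since ALY5 and ALY6 each contain BOL twice. Note that the leading identifier is included in the check, as in the question's code.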
Upvotes: 4