Reputation: 700
This is actually question about the semantics of set comprehensions, but I first need to explain the context. I'm trying to create a new set of tuples where the paired value in the touple is unique regardless of the order of the values in the pair. Simplifying my actual program, what I have is something like {(1, 2), (2, 1), (3, 4)}
and I'd like to get {(1, 2), (3, 4)}
I tried doing something like this:
oldSet = {(1, 2), (2, 1), (3, 4)}
newSet = set()
newSet = {(val1, val2) for (val1, val2) in oldSet if not (val2, val1) in newSet}
However, newSet
is {(1, 2), (2, 1), (3, 4)}
, implying that something is wrong with my conditional expression. My understanding of comprehensions suggests that the above is syntactic sugar for something like this:
newSet = set()
for (val1, val2) in oldSet:
if not (val2, val1) in newSet:
newSet.add((val1, val2))
This traditional looping structure works (newSet
is {(1, 2), (3, 4)}
). Is there something about comprehensions that causes the conditional to be evaluated before newSet
has any members? I'm fairly new to Python, so I'm wondering if there's something subtle I'm missing.
Thanks!
Upvotes: 2
Views: 92
Reputation: 1125398
You misunderstood; the set comprehension is a distinct expression, separate from the assignment. The expression produces a new set()
object that then will be assigned to newSet
, replacing the old set()
object you had.
As such, as you iterate and build the set, the previous and separate set()
object bound to newSet
remains empty. In reality, the set comprehension does this:
newSet = set()
_result = set()
for (val1, val2) in oldSet:
if not (val2, val1) in newSet:
result.add((val1, val2))
newSet = _result
You could use side effects to alter a separate set as you iterate:
seen = set()
newSet = {(val1, val2) for (val1, val2) in oldSet
if not ((val2, val1) in seen or seen.add((val1, val2))}
This uses seen
to track what has already been processed and includes a tuple if both conditions are true:
seen.add()
operation for the tuple returns a false value. Since seen.add()
always returns None
, that is always going to be the case.Note that this now builds the same set twice, so you may as well do a regular loop and be done with it:
newSet = set()
for (val1, val2) in oldSet:
if not (val2, val1) in newSet:
newSet.add((val1, val2))
Since your tuples are consist of two values only, you may as well use sorting here; any pair of tuples (a, b), (b, a)
has one unique sorting, after all:
newSet = {tuple(sorted(t)) for t in oldSet}
Upvotes: 5
Reputation: 11606
A working alternative is:
newSet = { tuple(sorted(t)) for t in oldSet }
Your solution is checking the presence of the tuple in the set that is being generated, but the name is not bound to the value yet. It will when the comprehension terminates.
Upvotes: 2