user1015266

Python: What's the difference between set.difference and set.difference_update?

s.difference(t) returns a new set with no elements in t.

s.difference_update(t) returns an updated set with no elements in t.

What's the difference between these two set methods? Because the difference_update updates set s, what precautions should be taken to avoid receiving a result of None from this method?

In terms of speed, shouldn't set.difference_update be faster since you're only removing elements from set s instead of creating a new set like in set.difference()?

Upvotes: 7

Views: 9699

Answers (2)

ivan_pozdeev

Reputation: 36126

difference_update updates the set in place rather than creating a new one.

>>> s={1,2,3,4,5}
>>> t={3,5}
>>> s.difference(t)
{1, 2, 4}
>>> s
{1, 2, 3, 4, 5}
>>> s.difference_update(t)
>>> s
{1, 2, 4}

Upvotes: 12

Raymond Hettinger

Reputation: 226764

Q. What's the difference between these two set methods?

A. The update version subtracts from an existing set, mutating it, and potentially leaving it smaller than it originally was. The non-update version produces a new set, leaving the originals unchanged.

Q. Because the difference_update updates set s, what precautions should be taken to avoid receiving a result of None from this method?

A. Mutating methods in Python generally return None as a way to indicate that they have mutated an object. The only "precaution" is to not assign the None result to a variable.
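
As a small illustration of that precaution (the variable names here are made up for the example), compare assigning the result of the in-place call with using the set itself or the non-mutating method:

```python
s = {1, 2, 3, 4, 5}
t = {3, 5}

# Pitfall: difference_update mutates s in place and returns None,
# so "result" does not hold a set.
result = s.difference_update(t)
print(result)

# After the in-place call, s itself holds the reduced contents.
print(s)

# When a new set is wanted and the original should stay intact,
# call difference() (or use the - operator) and assign that instead.
u = {1, 2, 3, 4, 5}
new_set = u.difference(t)
print(new_set, u)
```

In other words, call difference_update as a statement on its own line and keep working with s afterwards.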

Q. In terms of speed, shouldn't set.difference_update be faster since you're only removing elements from set s instead of creating a new set like in set.difference()?

A. Yes, the algorithm of the update version simply discards values.

In contrast, the algorithm for the non-updating version depends on the size of the sets.

If s is at least four times the size of t, the new-set version first copies the main set and then discards values from it. In that case, s - t is effectively implemented as n = s.copy(); n.difference_update(t).

Otherwise, the algorithm for the non-updating version is to create an empty new set n, loop over the elements of s and add them to n if they are not present in t.
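
The two strategies can be sketched in pure Python roughly as follows. This is only an illustration of the description above, not the actual implementation (which is written in C); the function name difference_sketch is hypothetical, the exact form of the size comparison is an assumption, and t is assumed to be a set.

```python
def difference_sketch(s, t):
    """Illustrative pure-Python version of the two strategies for s - t."""
    # Assumed threshold: s is at least four times the size of t.
    if len(s) >= 4 * len(t):
        # Strategy 1: copy the large set, then discard the few elements of t.
        n = s.copy()
        n.difference_update(t)
        return n

    # Strategy 2: build a new set from the elements of s that are not in t.
    n = set()
    for x in s:
        if x not in t:
            n.add(x)
    return n


small = difference_sketch({1, 2, 3, 4, 5}, {3, 5})      # takes the second branch
large = difference_sketch(set(range(100)), {1, 2})      # takes the first branch
print(small)
print(len(large))
```

Either branch produces the same result; the choice only affects how much work is done, since copying is cheap relative to many membership tests when t is small.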

Upvotes: 13
