user1406626

Reputation: 341

Set difference different using single character vs. string atoms?

I'm trying to understand why set difference gives different results, consistently across several Python versions, depending on whether the set elements are single-character strings or longer strings:

x = {'a': 1234, 'b': 2345, 'c': 9998}
y = set(x.keys())
w = {'aa': 1234, 'bb': 2345, 'cc': 9998}
z = set(w.keys())
print('c' in y)
print(y.difference('c'))
print(y.difference(set('c')))

print('cc' in z)
print(z.difference('cc'))
print(z.difference(set('cc')))

Produces:

True
set(['a', 'b'])
set(['a', 'b'])
True
set(['aa', 'cc', 'bb'])
set(['aa', 'cc', 'bb'])

I can't see why they should be different in behaviour.

Upvotes: 0

Views: 62

Answers (2)

ikostia

Reputation: 7597

This happens because a.difference(other) expects an iterable of elements to remove from a. When you call y.difference('c'), Python iterates over the string 'c', gets the single element 'c', and returns a copy of y without 'c'. When you call z.difference('cc'), Python iterates over the string 'cc' and gets 'c' twice, so it returns a copy of z without 'c' (i.e. the characters of the string, not the string itself, are the elements to remove). Since z does not contain 'c', nothing is removed.

As @jonrsharpe mentioned, set('cc') is a set with one element, 'c', not with 'cc'. If you want a set with 'cc' as an element, you have to construct it in a way similar to: set(['cc']). If you run z.difference(set(['cc'])), you'll see what you expect.
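A minimal sketch of the behaviour described above, assuming the same keys as z in the question (the result variable names are only illustrative):

```python
z = {'aa', 'bb', 'cc'}

# A bare string is iterated character by character, so this asks to
# remove 'c' (twice). 'c' is not an element of z, so nothing changes.
removed_chars = z.difference('cc')

# Wrapping the string in a list or a set makes it a single element.
removed_item = z.difference(['cc'])
removed_item_literal = z - {'cc'}

print(sorted(removed_chars))
print(sorted(removed_item))
print(sorted(removed_item_literal))
```

Note that the `-` operator only accepts sets on both sides, while the difference() method accepts any iterable, which is why the bare string is silently accepted there.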

Upvotes: 2

jorgeh

Reputation: 1767

If you read the docs, you'll see that the set() constructor takes an iterable. Since Python strings are iterable, set('x') is effectively the same as set('xxxx'):

set('x') == set('xxxx')  # True: both contain only the element 'x'

In your case you probably want set(['c']) and set(['cc']).
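As a rough illustration of the difference between the constructor forms (the variable names here are hypothetical, chosen only for this sketch):

```python
from_string = set('cc')    # iterates the string: its distinct characters
from_list = set(['cc'])    # iterates the list: the whole string as one element
from_literal = {'cc'}      # set literal, same result as set(['cc'])

print(from_string)
print(from_list)
print(from_list == from_literal)
```

The set literal form is available in Python 2.7 and 3.x; on older versions set(['cc']) is the way to go.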

Upvotes: 2
