One Python object method that doesn't return the modified object is the .add() method of set. This prevents chaining multiple calls to the method:
S = set()
S = S.add('item1').add('item2').add('item3')
giving:
AttributeError: 'NoneType' object has no attribute 'add'
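The error arises because set.add mutates the set in place and returns None, so the second .add is looked up on None rather than on the set. A minimal check (the variable names are only illustrative):

```python
S = set()

# set.add mutates S in place and returns None, not the set
result = S.add('item1')
print(result)   # None
print(S)        # {'item1'}

# The chained form therefore ends up calling .add on None
try:
    S.add('item2').add('item3')
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Note that the first call in the chain still runs before the error is raised, so 'item2' does end up in S.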
Why do I tend to prefer chaining .add() calls over using .update(), union(), or the | operator?
Because it gives clear, self-explanatory code that mimics spoken language. That makes it best suited for private use by occasional programmers, for whom being able to read their own code again after some time has passed is the main concern.
One workaround I know of to make the chaining above possible is to wrap the set in a class that re-implements the set methods. I have written the class chainOfSets for this purpose. With this class I can write:
S = set()
S = chainOfSets(S).add('item1').add('item2').add('item3').get()
print(S) # gives: {'item1', 'item3', 'item2'}
My question is:
Is there a better approach to allow chaining of object methods which don't return the object they manipulate than writing a custom class (e.g. chainOfSets, chainOfLists, chainOfPandas, etc.)?
Below is the chainOfSets class, with the + operator implemented:
class chainOfSets:
    """
    Allows chaining (by dot syntax) of otherwise non-chainable set() methods
    and addition/subtraction of other sets.
    It doesn't support interaction between objects of this class themselves,
    as this is considered out of scope for the purpose for which this
    class was created.
    """
    def __init__(s, sv=None):
        # avoid a mutable default: a set() default would be shared by all instances
        s.sv = set() if sv is None else sv
    # --- in-place methods: mutate the wrapped set, return the wrapper
    def add(s, itm):
        s.sv.add(itm)
        return s
    def update(s, *itm):  # each positional argument is added as one element
        s.sv.update(itm)
        return s
    def remove(s, itm):  # key error if not in set
        s.sv.remove(itm)
        return s
    def discard(s, itm):  # remove if present, but no error if not
        s.sv.discard(itm)
        return s
    def clear(s):
        s.sv.clear()
        return s
    # --- methods that rebind s.sv to a new set
    def intersection(s, p):
        s.sv = s.sv.intersection(p)
        return s
    def union(s, p):
        s.sv = s.sv.union(p)
        return s
    def __add__(s, itm):
        if isinstance(itm, set):
            s.sv = s.sv.union(itm)
        else:
            s.sv.update(itm)
        return s
    def difference(s, p):
        s.sv = s.sv.difference(p)
        return s
    def __sub__(s, itm):
        if isinstance(itm, set):
            s.sv = s.sv - itm
        else:
            # difference() returns a new set, so the result must be assigned
            s.sv = s.sv.difference(set(itm))
        return s
    def symmetric_difference(s, p):
        # equivalent to: union - intersection
        s.sv = s.sv.symmetric_difference(p)
        return s
    # --- queries: return a plain value, so they end the chain
    def len(s):
        return len(s.sv)
    def isdisjoint(s, p):
        return s.sv.isdisjoint(p)
    def issubset(s, p):
        return s.sv.issubset(p)
    def issuperset(s, p):
        return s.sv.issuperset(p)
    # ---
    def get(s):
        return s.sv
#:class chainOfSets(set)
print((chainOfSets(set([1,2,3]))+{5,6}-{1}).intersection({1,2,5}).get())
# gives {2,5}
Upvotes: 1
Views: 709
Reputation: 155584
You can make this work with a lot of effort. You shouldn't though. Python has pretty firm rules on methods of built-in types:
1. If a method of type X always returns an instance of X, it is creating a new modified instance and leaving the original instance unchanged.
2. If a method mutates the instance in place, it returns None (most common case) or something that is not (typically, aside from nested container cases) an instance of X (seen with stuff like the pop methods of set and dict).

These rules exist in part because Guido van Rossum (the creator of Python) finds arbitrary method chaining ugly and unreadable:
I find the chaining form a threat to readability; it requires that the reader must be intimately familiar with each of the methods. The [line-per-call] form [of the example code] makes it clear that each of these calls acts on the same object, and so even if you don't know the class and its methods very well, you can understand that the second and third call are applied to x (and that all calls are made for their side-effects), and not to something else.
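As a rough illustration of these conventions with built-in types: methods that return an instance of their own type leave the original untouched, while methods that mutate in place return None or some other kind of value. This is only a sketch with arbitrary variable names:

```python
# A method that returns an instance of its own type leaves the original unchanged
text = 'abc'
upper = text.upper()            # new str; text is still 'abc'

a = {1, 2}
b = a.union({3})                # new set; a is still {1, 2}

# A method that mutates in place returns None ...
items = [3, 1, 2]
sort_result = items.sort()      # items is now [1, 2, 3]
update_result = a.update({9})   # a is now {1, 2, 9}

# ... or something that is not an instance of the container type
d = {'k': 'v'}
popped = d.pop('k')             # the removed value, not a dict; d is now {}

print(upper, b, sort_result, update_result, popped)
```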
Experienced Python programmers come to rely on these rules. Your proposed class intentionally violates the rules, in an effort to make idioms from other languages work in Python. But there's no reason to do this. For simple stuff like chained adds, just use update/|= or union/| (depending on whether you want to make a new set or not):
S = set()
# In-place options:
S.update(('item1', 'item2', 'item3'))
# or
S |= {'item1', 'item2', 'item3'}
# Not-in-place options
S = S.union(('item1', 'item2', 'item3'))
# or
S = S | {'item1', 'item2', 'item3'}
All of those are perfectly simple, fast, and require no custom types.
In basically every real-world case where you truly want to chain multiple unrelated methods that can't be replaced by a single bulk method as in this case, your proposed class would save you a line or two. (If you really insist on compacting it all into a single line, you can always separate calls with semicolons on the same line, or fake it as a single expression by making a tuple from all the call results that begins or ends in the original object and indexing it so the expression evaluates to said object; it's no worse than what you're trying to do with a custom class.) But it would be slower (simply wrapping as in your question adds some overhead; dynamic wrapping via __getattr__ as in your answer is much more expensive), uglier, and unidiomatic. Code gets read more often than it's written, and it's frequently read by people who are not you; chasing maximum succinctness by introducing unnecessary new types that violate the idioms of the language they're written in helps no one.
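For reference, the two single-line tricks mentioned above could look roughly like the sketch below. The item names are made up for illustration, and neither form is recommended style; they only show that no custom type is needed:

```python
S = set()

# Several statements on one line, separated by semicolons
S.add('item1'); S.add('item2'); S.add('item3')

# A single expression: build a tuple of the call results (all None here)
# that ends with the original object, then index out that last element
T = (S.add('item4'), S.discard('item1'), S)[-1]

print(T is S)     # True
print(sorted(T))  # ['item2', 'item3', 'item4']
```

The tuple elements are evaluated left to right, so the calls run in the order written before the final element is picked out.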
Upvotes: 3