Travis Griggs

Reputation: 22262

More elegant way to avoid this hacky implementation of single line reduce for grouping a collection?

In my evolved answer to this question, I came up with a way to do a single line (well, single expression) reduce to create the results of groupby as defined by many other languages (Kotlin, ObjC, Swift, Smalltalk, at least).

My initial attempt looked like:

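from collections import defaultdict
from functools import reduce

def keyFunc(value):
    return derivative_of_value  # placeholder for whatever key is derived from value

grouped = reduce(
    # does not work: append() returns None, which becomes the next accum
    lambda accum, each: accum[keyFunc(each)].append(each),
    allValues,
    defaultdict(list))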

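As stated in my Aside/Tangent there, the problem is the lambda. A lambda is limited to a single expression, and for it to work in reduce, that expression must evaluate to the updated accumulator. Here it evaluates to the result of list.append, which is None, so the next reduction step receives None instead of the dict.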
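
To make the failure concrete, here is a minimal sketch of what happens when the lambda returns the result of append. The parity key function and the sample lists are assumptions chosen only for illustration.

```python
from collections import defaultdict
from functools import reduce

def key_func(value):
    # Hypothetical key function (an assumption for this sketch): group by parity.
    return value % 2

def naive_group(values):
    # list.append returns None, so the lambda hands None to the next step.
    return reduce(
        lambda accum, each: accum[key_func(each)].append(each),
        values,
        defaultdict(list))

# One element: the single lambda call returns None, which becomes the result.
single = naive_group([1])

# Two or more elements: the second call tries to subscript None.
try:
    naive_group([1, 2, 3])
    failure = None
except TypeError as exc:
    failure = exc
```

With more than one element, the second call raises a TypeError because None cannot be subscripted.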

So I came up with the following hack, using a tuple to move the dict reference from reduction to reduction, but also force the (ignored) side effect of updating the same dict:

from collections import defaultdict
from functools import reduce
grouped = reduce(
    lambda accum, each: (accum[0], accum[0][keyFunc(each)].append(each)),
    allValues,
    (defaultdict(list), None))[0]

The question is: is there a better way, given the constraint that I still want a single-expression reduce without a bunch of helper functions?

(I recognize that sometimes the code is telling you something, but I'm interested in this case for the academic side of things)

Upvotes: 0

Views: 140

Answers (2)

Tim

Reputation: 3427

Not sure why you want to do this with reduce or even defaultdict, but there are one-line solutions using list or dict comprehensions. For example, given

>>> from collections import defaultdict
>>> def func1(a):
...     return str(a)
>>> b = list(range(10)) + list(range(5))
>>> b
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]

the following is a one-line solution with dict:

>>> {x: [y for y in b if func1(y) == x] for x in set([func1(z) for z in b])}
{'4': [4, 4], '5': [5], '2': [2, 2], '8': [8], '9': [9], '7': [7], '0': [0, 0], '3': [3, 3], '1': [1, 1], '6': [6]}
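
One thing to keep in mind with the dict comprehension above is that it rescans b once per distinct key. As a hedged alternative that is still a single expression, itertools.groupby can be applied to input sorted by the same key. This sketch reuses func1 and b from the example above and assumes the keys are sortable.

```python
from itertools import groupby

def func1(a):
    return str(a)

b = list(range(10)) + list(range(5))

# groupby only merges adjacent items, so the input is sorted by the key first.
# sorted() is stable, so items keep their original relative order within a group.
grouped = {key: list(items) for key, items in groupby(sorted(b, key=func1), key=func1)}
```

The sort adds its own cost, so this is a trade-off, not a strict improvement.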

The solutions below get the job done, but are bad practice (as pointed out by @juanpa.arrivillaga in the comments) because they create a potentially very large list of None values and then throw it away immediately. See "Is it Pythonic to use list comprehensions for just side effects?"

More one-line solutions (two lines, actually, if you count the defaultdict(list) initialization) using a list comprehension.

For example, with defaultdict:

>>> a = defaultdict(list)
>>> [a[func1(x)].append(x) for x in b]
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
>>> a
defaultdict(<class 'list'>, {'0': [0, 0], '1': [1, 1], '2': [2, 2], '3': [3, 3], '4': [4, 4], '5': [5], '6': [6], '7': [7], '8': [8], '9': [9]})

Or with a normal dict:

>>> c = {}
>>> [c[func1(x)].append(x) if c.get(func1(x)) else c.update({func1(x):[x]}) for x in b]
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
>>> c
{'0': [0, 0], '1': [1, 1], '2': [2, 2], '3': [3, 3], '4': [4, 4], '5': [5], '6': [6], '7': [7], '8': [8], '9': [9]}
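
For comparison, a plain loop does the same grouping without building the throwaway list of None values. This is a sketch and no longer a one-liner; it reuses func1 and b from above and relies on dict.setdefault to create each list on first use.

```python
def func1(a):
    return str(a)

b = list(range(10)) + list(range(5))

c = {}
for x in b:
    # setdefault returns the existing list for this key,
    # or inserts a new empty list and returns that.
    c.setdefault(func1(x), []).append(x)
```

Unlike the conditional expression above, this evaluates func1(x) only once per item.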

Upvotes: 0

Travis Griggs

Reputation: 22262

I posted this to the Python mailing list and received two solutions that were basically the same trick, but more elegantly refined:

grouped = reduce(
    lambda groups, each: groups[keyFunc(each)].append(each) or groups,
    allValues,
    defaultdict(list))

This uses an or to skip over the None returned by the part that does the actual work: since None is falsy, the whole expression evaluates to groups. Yay for falsey.

grouped = reduce(
    lambda groups, each: (groups[keyFunc(each)].append(each), groups)[1],
    allValues,
    defaultdict(list))

This second one still uses the tuple, but confines it to the one place where it's needed, so it doesn't leak into the rest of the code.
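
Both refinements can be checked side by side. The sketch below is self-contained; the first-letter key function and the word list are assumptions made up for the demonstration, not part of the mailing-list answers.

```python
from collections import defaultdict
from functools import reduce

def keyFunc(value):
    # Hypothetical key (an assumption for this sketch): first letter of each word.
    return value[0]

allValues = ["apple", "avocado", "banana", "blueberry", "cherry"]

# Variant 1: append returns None (falsy), so `or` evaluates to groups.
with_or = reduce(
    lambda groups, each: groups[keyFunc(each)].append(each) or groups,
    allValues,
    defaultdict(list))

# Variant 2: build a throwaway (None, groups) tuple and pick out groups.
with_tuple = reduce(
    lambda groups, each: (groups[keyFunc(each)].append(each), groups)[1],
    allValues,
    defaultdict(list))

same = dict(with_or) == dict(with_tuple)
```

The or form depends on the side-effecting call returning something falsy, which list.append always does; the tuple form works whatever that call returns.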

Upvotes: 2
