Reputation: 8391
Is there a nice Pythonic way of grouping a list into a list of lists, where each inner list contains only those elements that have the same projection, defined by the user as a function?
Example:
>>> x = [0, 1, 2, 3, 4, 5, 6, 7]
>>> groupby(x, projection=lambda e: e % 3)
[[0, 3, 6], [1, 4, 7], [2, 5]]
I don't care about the projection itself, just that if it is equal for some elements these must end up in the same sublist.
I'm basically looking for a Python equivalent of the Haskell function GHC.Exts.groupWith:
Prelude> import GHC.Exts
Prelude GHC.Exts> groupWith (`mod` 3) [0..7]
[[0,3,6],[1,4,7],[2,5]]
Upvotes: 8
Views: 23924
Reputation: 31161
Here is one approach using compress from itertools:
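from itertools import compress
import numpy as np
x = [0, 1, 2, 3, 4, 5, 6, 7]
L = [i % 3 for i in x]  # projection of each element of x
# For each distinct key, keep the elements of x whose projection equals it.
[list(compress(x, np.array(L) == i)) for i in set(L)]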
#[[0, 3, 6], [1, 4, 7], [2, 5]]
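The same idea also works without numpy, since compress accepts any iterable of truthy/falsy selectors. The following is only a sketch of that variant; the names keys and grouped are made up for illustration, and sorted(set(keys)) is used so the group order does not depend on set iteration order.

```python
from itertools import compress

x = [0, 1, 2, 3, 4, 5, 6, 7]
keys = [e % 3 for e in x]  # projection of every element (illustrative name)

# For each distinct key, build a boolean selector and keep the matching elements.
grouped = [
    list(compress(x, (k == key for k in keys)))
    for key in sorted(set(keys))
]
print(grouped)  # [[0, 3, 6], [1, 4, 7], [2, 5]]
```

Note that this scans the whole list once per distinct key, so it is best suited to a small number of groups.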
Upvotes: 1
Reputation: 1726
The itertools module in the standard library contains a groupby() function that should do what you want.
Note that the input to groupby() should be sorted by the group key to yield each group only once, but it's easy to use the same key function for sorting. So if your key function (projection) takes a number modulo 3, it would look like this:
from itertools import groupby
x = [0, 1, 2, 3, 4, 5, 6, 7]
def projection(val):
    return val % 3
x_sorted = sorted(x, key=projection)
x_grouped = [list(it) for k, it in groupby(x_sorted, projection)]
print(x_grouped)
[[0, 3, 6], [1, 4, 7], [2, 5]]
Note that while this version only uses standard Python features, if you are dealing with more than maybe 100,000 values, you should look into pandas (see @ayhan's answer).
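To get something closer to the groupWith call from the question, the sort and the groupby() step can be wrapped in one function. This is a minimal sketch; group_with is a hypothetical name, not an existing library function, and it assumes the projection values can be ordered.

```python
from itertools import groupby


def group_with(iterable, projection):
    """Group items whose projection is equal; groups come out ordered by key."""
    # sorted() is stable, so items keep their original relative order within a group.
    ordered = sorted(iterable, key=projection)
    return [list(group) for _, group in groupby(ordered, key=projection)]


print(group_with(range(8), lambda e: e % 3))  # [[0, 3, 6], [1, 4, 7], [2, 5]]
```

Because the same function is used for sorting and grouping, every key shows up in exactly one sublist.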
Upvotes: 15
Reputation:
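A pandas version would look like this. Note that a function passed to groupby is called on the index labels, which for this x happen to equal the values: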
import pandas as pd
x = [0, 1, 2, 3, 4, 5, 6, 7]
pd.Series(x).groupby(lambda t: t%3).groups
Out[13]: {0: [0, 3, 6], 1: [1, 4, 7], 2: [2, 5]}
Or
pd.Series(x).groupby(lambda t: t%3).groups.values()
Out[32]: dict_values([[0, 3, 6], [1, 4, 7], [2, 5]])
Upvotes: 3
Reputation: 36013
No need to sort.
from collections import defaultdict
def groupby(iterable, projection):
    # Map each projection value to the list of items that produce it.
    result = defaultdict(list)
    for item in iterable:
        result[projection(item)].append(item)
    return result
x = [0, 1, 2, 3, 4, 5, 6, 7]
groups = groupby(x, projection=lambda e: e % 3)
print(groups)
print(groups[0])
Output:
defaultdict(<class 'list'>, {0: [0, 3, 6], 1: [1, 4, 7], 2: [2, 5]})
[0, 3, 6]
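If a plain list of lists is wanted, as in the question, the dictionary's values can be returned instead. A possible sketch follows; group_in_first_seen_order is a made-up name, and it relies on dicts preserving insertion order (Python 3.7+), so groups appear in the order their key is first seen rather than sorted by key. Keys only need to be hashable, not orderable.

```python
def group_in_first_seen_order(iterable, projection):
    """Return a list of groups, ordered by first appearance of each key."""
    groups = {}
    for item in iterable:
        # setdefault creates the list the first time a key is encountered.
        groups.setdefault(projection(item), []).append(item)
    return list(groups.values())


x = [0, 1, 2, 3, 4, 5, 6, 7]
print(group_in_first_seen_order(x, lambda e: e % 3))  # [[0, 3, 6], [1, 4, 7], [2, 5]]
```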
Upvotes: 7