George Sotiropoulos

Reputation: 2143

Keep some groups from GroupBy using list of indices

Hello StackOverflowers!

I have a pandas DataFrame:

import pandas as pd

df = pd.DataFrame({
    'A':[1,1,2,1,3,3,1,6,3,5,1],
    'B':[10,10,300,10,30,40,20,10,30,45,20],
    'C':[20,20,20,20,15,20,15,15,15,15,15],
    'D':[10,20,30,40,80,10,20,50,30,10,70],
    'E':[10,10,10,22,22,3,4,5,9,0,1]
})

Then I group it by some columns:

groups = df.groupby(['A', 'B', 'C'])

I would like to select/filter rows of the original data based on the positions of the groups in the groupby.

For example, I would like to pick 3 random groups (combinations of A, B and C) out of the groupby and get their rows.

Any ideas?

Upvotes: 1

Views: 155

Answers (2)

yatu

Reputation: 88276

Instead of iterating over all groups len(indices) times and indexing the resulting list once per index, get a list of the group keys from the dictionary returned by GroupBy.groups, then make a single call to GroupBy.get_group for each index:

indices = [1, 4, 3]  # positions of the wanted groups, as in the other answer
keys = list(groups.groups.keys())
# [(1, 10, 20), (1, 20, 15), (2, 300, 20)...
pd.concat([groups.get_group(keys[i]) for i in indices])

    A   B   C   D   E
6   1  20  15  20   4
10  1  20  15  70   1
5   3  40  20  10   3
4   3  30  15  80  22
8   3  30  15  30   9
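
For the "3 random combinations" part of the question, the same idea can be sketched by sampling the keys instead of using fixed positions. This is only a sketch: it assumes the df from the question, and the names rng, sampled_keys and sample, as well as the seed value, are made up for illustration.

```python
import random

import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 2, 1, 3, 3, 1, 6, 3, 5, 1],
    'B': [10, 10, 300, 10, 30, 40, 20, 10, 30, 45, 20],
    'C': [20, 20, 20, 20, 15, 20, 15, 15, 15, 15, 15],
    'D': [10, 20, 30, 40, 80, 10, 20, 50, 30, 10, 70],
    'E': [10, 10, 10, 22, 22, 3, 4, 5, 9, 0, 1],
})
groups = df.groupby(['A', 'B', 'C'])

# One key per group; each key is an (A, B, C) tuple.
keys = list(groups.groups.keys())

# Draw 3 distinct keys. The seed is arbitrary and only makes the draw repeatable.
rng = random.Random(0)
sampled_keys = rng.sample(keys, 3)

# One get_group call per sampled key, concatenated in sampling order.
sample = pd.concat([groups.get_group(key) for key in sampled_keys])
print(sample)
```

random.sample draws without replacement, so the three groups are always different; it raises ValueError if fewer than 3 groups exist.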

Upvotes: 3

George Sotiropoulos

Reputation: 2143

What I could do is:

groups = df.groupby(['A', 'B', 'C'])

indices = [1, 4, 3]
pd.concat([[df_group for names, df_group in groups][i] for i in indices])

Which results in:

Out[24]: 
    A   B   C   D   E
6   1  20  15  20   4
10  1  20  15  70   1
5   3  40  20  10   3
4   3  30  15  80  22
8   3  30  15  30   9

I wonder if there is a more elegant way, maybe something already implemented in pandas' groupby?
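
One built-in candidate might be GroupBy.ngroup, which labels every row with the position of its group; a boolean mask then keeps the wanted groups. A minimal sketch, assuming the same df and indices as above (group_numbers and selected are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 2, 1, 3, 3, 1, 6, 3, 5, 1],
    'B': [10, 10, 300, 10, 30, 40, 20, 10, 30, 45, 20],
    'C': [20, 20, 20, 20, 15, 20, 15, 15, 15, 15, 15],
    'D': [10, 20, 30, 40, 80, 10, 20, 50, 30, 10, 70],
    'E': [10, 10, 10, 22, 22, 3, 4, 5, 9, 0, 1],
})
groups = df.groupby(['A', 'B', 'C'])
indices = [1, 4, 3]

# ngroup() returns a Series aligned with df, holding each row's group number
# (0-based, in the order the groups appear when iterating the groupby).
group_numbers = groups.ngroup()

# Keep only the rows whose group number is one of the wanted positions.
selected = df[group_numbers.isin(indices)]
print(selected)
```

Note that this keeps the rows in their original order (4, 5, 6, 8, 10) rather than in the order given by indices, so it is not a drop-in replacement when the order of the groups matters.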

Upvotes: 0
