Reputation: 2143
Hello StackOverflowers!
I have a pandas DataFrame:
import pandas as pd
df = pd.DataFrame({
'A':[1,1,2,1,3,3,1,6,3,5,1],
'B':[10,10,300,10,30,40,20,10,30,45,20],
'C':[20,20,20,20,15,20,15,15,15,15,15],
'D':[10,20,30,40,80,10,20,50,30,10,70],
'E':[10,10,10,22,22,3,4,5,9,0,1]
})
Then I group it by some columns:
groups = df.groupby(['A', 'B', 'C'])
I would like to select rows of the original data based on the positions of the groups in the groupby.
For example, I would like to get the rows of 3 randomly chosen (A, B, C) combinations out of the groupby.
Any ideas?
Upvotes: 1
Views: 155
Reputation: 88276
Instead of iterating over all groups len(indices) times and indexing the resulting list with the respective indices value each time, get a list of the groups' keys from the dictionary returned by GroupBy.groups, and make a single call to GroupBy.get_group for each index:
keys = list(groups.groups.keys())
# [(1, 10, 20), (1, 20, 15), (2, 300, 20)...
pd.concat([groups.get_group(keys[i]) for i in indices])
    A   B   C   D   E
6   1  20  15  20   4
10  1  20  15  70   1
5   3  40  20  10   3
4   3  30  15  80  22
8   3  30  15  30   9
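To make this runnable end to end, here is a minimal sketch that rebuilds the DataFrame from the question, repeats the lookup for indices = [1, 4, 3], and then picks 3 random groups with random.sample. The seeded random.Random(0) and the names rng, random_keys and random_selection are my own additions for illustration, not something from the question.

```python
import random

import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 2, 1, 3, 3, 1, 6, 3, 5, 1],
    'B': [10, 10, 300, 10, 30, 40, 20, 10, 30, 45, 20],
    'C': [20, 20, 20, 20, 15, 20, 15, 15, 15, 15, 15],
    'D': [10, 20, 30, 40, 80, 10, 20, 50, 30, 10, 70],
    'E': [10, 10, 10, 22, 22, 3, 4, 5, 9, 0, 1],
})
groups = df.groupby(['A', 'B', 'C'])

# One (A, B, C) tuple per group
keys = list(groups.groups.keys())

# Fixed positions: one get_group call per requested group
indices = [1, 4, 3]
selected = pd.concat([groups.get_group(keys[i]) for i in indices])

# Three random groups; the seed is an assumption, used only so the
# sketch gives the same groups on every run
rng = random.Random(0)
random_keys = rng.sample(keys, 3)
random_selection = pd.concat([groups.get_group(k) for k in random_keys])
```

Drop the seed (or use random.sample directly) if you want a different set of groups on each run.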
Upvotes: 3
Reputation: 2143
What I could do is:
groups = df.groupby(['A', 'B', 'C'])
indices = [1, 4, 3]
pd.concat([[df_group for names, df_group in groups][i] for i in indices])
which results in:
Out[24]:
    A   B   C   D   E
6   1  20  15  20   4
10  1  20  15  70   1
5   3  40  20  10   3
4   3  30  15  80  22
8   3  30  15  30   9
I wonder if there is a more elegant way, maybe something already built into pandas' groupby?
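One built-in candidate might be GroupBy.ngroup, which labels every row with the number of its group (in the same order the groups are iterated), so the selection becomes a boolean mask. This is only a sketch of an alternative, not the same thing as the concat above: the rows come back in their original DataFrame order, not in the order of indices. The names group_numbers and selected are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 2, 1, 3, 3, 1, 6, 3, 5, 1],
    'B': [10, 10, 300, 10, 30, 40, 20, 10, 30, 45, 20],
    'C': [20, 20, 20, 20, 15, 20, 15, 15, 15, 15, 15],
    'D': [10, 20, 30, 40, 80, 10, 20, 50, 30, 10, 70],
    'E': [10, 10, 10, 22, 22, 3, 4, 5, 9, 0, 1],
})
groups = df.groupby(['A', 'B', 'C'])
indices = [1, 4, 3]

# Series aligned with df: each row gets the number of the group it belongs to
group_numbers = groups.ngroup()

# Keep the rows whose group number was requested (original row order is kept)
selected = df[group_numbers.isin(indices)]
```

For a random pick, the positions could be drawn first, for example with random.sample(range(groups.ngroups), 3), and passed in as indices.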
Upvotes: 0