PaoloCrosetto

Reputation: 732

pandas dataframe groupby and get arbitrary member of each group

This question is similar to this other question.

I have a pandas dataframe. I want to split it into groups and select one member of each group, where the position of the member to pick is defined elsewhere.

Example: I have a dataframe that can be divided into 6 groups of 4 observations each. I want to extract the observations according to:

selected = [0,3,2,3,1,3]

This is very similar to

df.groupby('groupvar').nth(n)

But, crucially, n varies for each group according to the selected list.

Thanks!

Upvotes: 0

Views: 1242

Answers (1)

FooBar

Reputation: 16498

Typically, whatever you do inside a groupby should be independent of which group it is applied to. Within any groupby.apply(), you only get the group itself, not its context (such as its position among the groups). An alternative is to compute the positional index into the whole sample (index below) from the within-group positions (selected here). Note that the dataset must be sorted by group for the following to work.

I use the dataframe test, from which I want to pick the rows given by selected:

In[231]: test
Out[231]: 
           score
  name          
0 A    -0.208392
1 A    -0.103659
2 A     1.645287
0 B     0.119709
1 B    -0.047639
2 B    -0.479155
0 C    -0.415372
1 C    -1.390416
2 C    -0.384158
3 C    -1.328278

import numpy as np
selected = [0, 2, 1]
c = test.groupby(level=1).count()
In[242]: index = c.shift(1).cumsum().add(np.array([selected]).T, fill_value=0).astype(int)
In[243]: index
Out[243]: 
      score
name       
A         0
B         5
C         7
In[255]: test.iloc[index.values[:,0]]
Out[255]: 
           score
  name          
0 A    -0.208392
2 B    -0.479155
1 C    -1.390416
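
For reference, the same idea can be written as a self-contained script. This is only a minimal sketch, not the session above: it assumes name is an ordinary column rather than an index level, and the names sizes, offsets, positions and wanted are made up for illustration. The first variant adds each group's start offset to the within-group position and needs the rows sorted by group. The second numbers the rows inside each group with cumcount() and does not rely on the sort order.

```python
import numpy as np
import pandas as pd

# Toy data mirroring the example above: groups A, B, C with 3, 3 and 4 rows.
test = pd.DataFrame(
    {
        "name": list("AAABBBCCCC"),
        "score": [-0.208392, -0.103659, 1.645287, 0.119709, -0.047639,
                  -0.479155, -0.415372, -1.390416, -0.384158, -1.328278],
    }
)
selected = [0, 2, 1]  # within-group position to pick for A, B, C

# Variant 1: global position = start offset of the group + within-group position.
# Assumes the rows are sorted by group.
sizes = test.groupby("name").size()               # rows per group, ordered by key
offsets = sizes.cumsum().shift(1, fill_value=0)   # first row position of each group
positions = offsets.to_numpy() + np.asarray(selected)
picked = test.iloc[positions]

# Variant 2: number the rows within each group and keep the row whose
# number equals the position wanted for that group.
wanted = test["name"].map(dict(zip(sizes.index, selected)))
picked_alt = test[test.groupby("name").cumcount() == wanted]

print(picked)
print(picked_alt)
```

Both variants should return the same three rows as the output above. If a position in selected is larger than its group, the first variant silently picks a row from the next group (or raises for the last group), while the second simply returns no row for that group.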

Upvotes: 1
