Implementation of pandas groupby - indexing and slicing

Question

It looks like I don't properly understand how pandas groupby works or how to use it. Could someone please explain what I am doing wrong, or how to approach my problem? I would like to extract some data based on different columns. See the example:

    Number Name  Param1    Param2    Param3
0        1    A       0  0.179264  0.565864
1        2    A       1  0.374258  0.985103
2        1    C       2  0.799988  0.855600
3        3    B       3  0.237612  0.290065
4        3    C       4  0.195463  0.232030
5        2    C       5  0.611886  0.712429
6        4    A       6  0.178465  0.056347
7        1    B       7  0.018789  0.393464
8        5    B       8  0.549566  0.457160
9        4    B       9  0.149801  0.590501
10       4    C      10  0.112857  0.327013
11       3    A      11  0.902660  0.670725
12       2    B      12  0.474427  0.104224
13       5    C      13  0.691259  0.620992
14       5    A      14  0.043179  0.028890

Then I want to do an operation involving two loops. Basically, in this example, I would like to print the parameters [Param1, Param2, Param3] (as an array, but that's not the problem) of each Name that belongs to each Number.

So the desired result would look like:

Number 1: [[0, 0.179264, 0.565864], [2, 0.799988, 0.855600], [7, 0.018789, 0.393464]]
Number 2: [[1, 0.374258, 0.985103], [5, 0.611886, 0.712429], [12, 0.474427, 0.104224]]
etc.

(Then I want to plot them and use "Name" for labelling.)

This is the code:

for n in example.groupby('Number'):
    for name in example['Name']:
        params = np.array(example.loc[n['Name']==name,['Param1','Param2','Param3']])
        print 'Group:', n
        print 'Params:\n', params

But it seems like I can't use the DataFrame's indexing on a groupby object. This code produces TypeError: tuple indices must be integers, not str. I may have made several errors while trying to figure this out, but correctly indexing and slicing the groupby object seems to be the main issue.

jezrael · Accepted Answer

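I am not sure I understand correctly, but try looping over the groups. Iterating over a groupby object yields (key, sub-DataFrame) tuples, so unpack both; in your code n is the whole tuple, which is why n['Name'] raises the TypeError. Here df is your example DataFrame: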

for i, n in df.groupby('Number'):
    print(i)
    print(n[['Param1', 'Param2', 'Param3']])
    # for output as nested lists
    # print(n[['Param1', 'Param2', 'Param3']].values.tolist())

1
   Param1    Param2    Param3
0       0  0.179264  0.565864
2       2  0.799988  0.855600
7       7  0.018789  0.393464
2
    Param1    Param2    Param3
1        1  0.374258  0.985103
5        5  0.611886  0.712429
12      12  0.474427  0.104224
3
    Param1    Param2    Param3
3        3  0.237612  0.290065
4        4  0.195463  0.232030
11      11  0.902660  0.670725
4
    Param1    Param2    Param3
6        6  0.178465  0.056347
9        9  0.149801  0.590501
10      10  0.112857  0.327013
5
    Param1    Param2    Param3
8        8  0.549566  0.457160
13      13  0.691259  0.620992
14      14  0.043179  0.028890
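
If the goal is the nested-list form from the question together with the Name of each row for labelling, one possible sketch is shown here. It rebuilds the example table by hand so that it runs on its own; the names param_cols and params_by_number are only illustrative and do not come from the question. It assumes each Name occurs at most once per Number, as in the example data.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from its table).
df = pd.DataFrame({
    "Number": [1, 2, 1, 3, 3, 2, 4, 1, 5, 4, 4, 3, 2, 5, 5],
    "Name": list("AACBCCABBBCABCA"),
    "Param1": list(range(15)),
    "Param2": [0.179264, 0.374258, 0.799988, 0.237612, 0.195463,
               0.611886, 0.178465, 0.018789, 0.549566, 0.149801,
               0.112857, 0.902660, 0.474427, 0.691259, 0.043179],
    "Param3": [0.565864, 0.985103, 0.855600, 0.290065, 0.232030,
               0.712429, 0.056347, 0.393464, 0.457160, 0.590501,
               0.327013, 0.670725, 0.104224, 0.620992, 0.028890],
})

param_cols = ["Param1", "Param2", "Param3"]  # illustrative helper name
params_by_number = {}                        # illustrative helper name

# Outer loop: one (key, sub-DataFrame) pair per distinct Number.
for number, group in df.groupby("Number"):
    # Inner step: pair each Name in this group with its parameter row.
    # .values gives one float array, so Param1 comes back as a float.
    rows = group[param_cols].values.tolist()
    params_by_number[number] = dict(zip(group["Name"], rows))
    print("Number", number, params_by_number[number])
```

Each entry of params_by_number then maps a Name to its [Param1, Param2, Param3] list, so the Name is available as a plot label when iterating over the inner dictionary.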
