Reputation: 15010
I have a CSV file as shown below:
Hour,L,Dr,Tag,Code,Vge
0,L5,XI,PS,4R,15
0,L3,St,sst,4R,17
5,L5,XI,PS,4R,12
2,L0,St,v2T,4R,11
8,L2,TI,sst,4R,8
12,L5,XI,PS,4R,18
2,L2,St,PS,4R,9
12,L3,XI,sst,4R,16
I run the following in my IPython notebook:
In[1]
import pandas as pd
In[2]
df = pd.read_csv('/python/concepts/pandas/in.csv')
In[3]
df.head(n=9)
Out[1]:
Hour L Dr Tag Code Vge
0 0 L5 XI PS 4R 15
1 0 L3 St sst 4R 17
2 5 L5 XI PS 4R 12
3 2 L0 St v2T 4R 11
4 8 L2 TI sst 4R 8
5 12 L5 XI PS 4R 18
6 2 L2 St PS 4R 9
7 12 L3 XI sst 4R 16
In[4]
df.groupby(('Hour'))['Vge'].head(n=9)
Out[2]
0 15
1 17
2 12
3 11
4 8
5 18
6 9
7 16
Name: Vge, dtype: int64
The output doesn't seem to be grouped by Hour. Rather, it looks like it is output in the order of the DataFrame's internal index.
I am trying to understand how groupby is used with a pandas DataFrame. The usage hasn't clicked for me yet. I would appreciate it if someone could guide me.
Upvotes: 0
Views: 320
Reputation: 85622
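You need to do something with the groups. Calling head(n=9) on a groupby only returns the first nine rows of each group, keeping the original index and row order, and since no group here has more than nine rows you get every row back unchanged. To see one result per Hour, apply an aggregation. For example: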
>>> df.groupby('Hour')[['Vge']].sum()
Vge
Hour
0 32
2 20
5 12
8 8
12 34
or:
>>> df.groupby('Hour')['Vge'].count()
Hour
0 2
2 2
5 1
8 1
12 2
Name: Vge, dtype: int64
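If it helps to see what the groups actually contain, here is a minimal sketch that rebuilds the data from the question inline (via io.StringIO, which is my assumption so that it runs without the file on disk) and compares head, iteration over the groups, and an aggregation. The variable names grouped, first_rows, members and totals are only illustrative.

```python
import io
import pandas as pd

# The CSV from the question, inlined so the sketch runs without the file on disk.
csv_text = """Hour,L,Dr,Tag,Code,Vge
0,L5,XI,PS,4R,15
0,L3,St,sst,4R,17
5,L5,XI,PS,4R,12
2,L0,St,v2T,4R,11
8,L2,TI,sst,4R,8
12,L5,XI,PS,4R,18
2,L2,St,PS,4R,9
12,L3,XI,sst,4R,16
"""
df = pd.read_csv(io.StringIO(csv_text))

grouped = df.groupby('Hour')['Vge']

# head(n) keeps the first n rows of each group but returns them in the
# original row order, so with n=9 every row comes back unchanged.
first_rows = grouped.head(9)

# Iterating over the GroupBy object yields (key, values) pairs, which shows
# the rows that belong to each Hour.
members = {int(hour): values.tolist() for hour, values in grouped}

# An aggregation collapses each group to a single value, indexed by Hour.
totals = grouped.sum()

print(first_rows)
print(members)
print(totals)
```

The grouping itself is lazy: nothing looks grouped until you iterate over the groups or reduce them with something like sum, count or mean.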
Upvotes: 1