multigoodverse

Reputation: 8072

Grouping data in Python with pandas yields a blank first row

I have this nice pandas dataframe:

(image: screenshot of the dataframe)

I want to group it by column "0" (which represents the year) and calculate the mean of the other columns for each year. I do this with the following code:

df.groupby(0)[[2, 3, 4]].mean()

That successfully calculates the mean of every column. The problem is the empty row that appears at the top of the output:

(image: screenshot of the grouped output, with the empty row on top)

Upvotes: 0

Views: 977

Answers (1)

EdChum

Reputation: 394101

That's just a display thing. The grouped column becomes the index of the result, and pandas prints the index name on its own line below the column headers. You will notice that you still get this line even when you set pd.set_option('display.notebook_repr_html', False). It has no effect on operations on the grouped df:

In [30]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.random.randn(5), 'b': np.random.randn(5), 'c': np.arange(5)})
df
Out[30]:
          a         b  c
0  0.766706 -0.575700  0
1  0.594797 -0.966856  1
2  1.852405  1.003855  2
3 -0.919870 -1.089215  3
4 -0.647769 -0.541440  4
In [31]:

df.groupby('c')[['a', 'b']].mean()
Out[31]:
          a         b
c                    
0  0.766706 -0.575700
1  0.594797 -0.966856
2  1.852405  1.003855
3 -0.919870 -1.089215
4 -0.647769 -0.541440
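
To check that the extra line is only the index label and not a real row, you can inspect the shape of the result and look values up by group key. This is a minimal sketch with made-up data; the frame and the name `means` are illustrative and not from the question:

```python
import pandas as pd

# Illustrative data (assumption): two groups in column 'c'
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10.0, 20.0, 30.0, 40.0],
    "c": [0, 0, 1, 1],
})

means = df.groupby("c")[["a", "b"]].mean()

# Two groups and two value columns: there is no extra row in the data
print(means.shape)

# The grouping key is now the index, so rows are addressed by key
print(means.loc[0, "a"])

# The "blank" line in the printed output is where this name is shown
print(means.index.name)
```

The shape is (2, 2), so the empty-looking line is part of the header, not the data.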

Technically speaking, the index of the result has been assigned a name attribute:

In [32]:

df.groupby('c')[['a', 'b']].mean().index.name
Out[32]:
'c'

By default, an index has no name unless one has been assigned:

In [34]:

print(df.index.name)
None
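
If you would rather not see that header line at all, a few options follow from the fact that it is just the index name. The sketch below uses made-up data, and the variable names (`flat`, `reset`, `unnamed`) are illustrative assumptions; which option fits depends on whether you want the grouping key as an index or as a regular column:

```python
import pandas as pd

# Illustrative data (assumption): two groups in column 'c'
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10.0, 20.0, 30.0, 40.0],
    "c": [0, 0, 1, 1],
})

means = df.groupby("c")[["a", "b"]].mean()

# Option 1: keep the grouping key as an ordinary column from the start
flat = df.groupby("c", as_index=False)[["a", "b"]].mean()

# Option 2: move the index back into a column after aggregating
reset = means.reset_index()

# Option 3: keep the key as the index but clear its name,
# so no separate header line is printed for it
unnamed = means.copy()
unnamed.index.name = None

print(flat)
print(reset)
print(unnamed)
```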

Upvotes: 1
