Reputation: 3301
Hi, I was working on an assignment. After a DataFrame melt operation followed by a groupby and count, I noticed something interesting in the result:
melted_df = df.melt(id_vars= 'cardio',value_vars=['cholesterol' ,'gluc' , 'smoke' , 'alco' , 'active' , 'overweight'])
melted_df = pd.DataFrame(melted_df.groupby(['cardio' , 'variable' , 'value'])['value'].count())
After this operation, the column headers are split into two levels.
I can rename only the first-level names, not the lower-level ones.
Can someone explain why there are two levels?
Upvotes: 2
Views: 700
Reputation: 9207
You need to set the as_index parameter:
pd.DataFrame(melted_df.groupby(['cardio' , 'variable' , 'value'], as_index=False)['value'].count())
The problem occurs because you pass an existing groupby result, whose index holds the grouping keys, as data into a new DataFrame.
Is there a reason you don't assign the groupby result directly, like this?
melted_df = melted_df.groupby(['cardio' , 'variable' , 'value'], as_index=False)['value'].count()
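To illustrate where the second header row comes from, here is a minimal sketch on made-up toy data (the values are an assumption for illustration, not the real medical_examination.csv). Without as_index=False, the count comes back as a Series whose index is a three-level MultiIndex; wrapping it in pd.DataFrame keeps those keys in the index, so the display shows the column name on one row and the index names on another. One way to flatten it is Series.reset_index with a new name for the count column, which also avoids a clash with the existing 'value' key:

```python
import pandas as pd

# Toy stand-in for the melted frame (values invented for illustration).
melted = pd.DataFrame({
    "cardio":   [0, 0, 0, 1, 1],
    "variable": ["smoke", "smoke", "alco", "smoke", "alco"],
    "value":    [0, 0, 1, 1, 1],
})

# The grouped count is a Series indexed by the three grouping keys.
counts = melted.groupby(["cardio", "variable", "value"])["value"].count()

# Wrapping it keeps the keys in the index: one data column ('value'),
# while 'cardio', 'variable', 'value' are index names, not columns.
wrapped = pd.DataFrame(counts)

# Moving the keys back into ordinary columns; name= renames the count
# column so it does not collide with the 'value' key.
flat = counts.reset_index(name="total")
print(flat)
```

In this sketch, wrapped.columns contains only 'value', which is presumably why only that name could be renamed as a column; the other labels would need something like rename_axis, since they belong to the index.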
Based on the request in your comment:
import pandas as pd
df = pd.read_csv(r"D:\test\medical_examination.csv")
df = df.melt(id_vars=['id', 'cardio'], value_vars=['cholesterol', 'gluc', 'smoke', 'alco', 'active'])
df = df.groupby(['cardio', 'variable', 'value'])['value'].agg(total='sum').reset_index()
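Note that the named aggregation above sums the values in each group, which is not the same as the count in the question. A small sketch on invented toy data (an assumption, not the real CSV) shows the difference; which one is wanted depends on the assignment:

```python
import pandas as pd

# Toy stand-in for the melted frame (values invented for illustration).
toy = pd.DataFrame({
    "cardio":   [0, 0, 0, 1, 1],
    "variable": ["smoke", "smoke", "alco", "smoke", "alco"],
    "value":    [0, 0, 1, 1, 1],
})

grouped = toy.groupby(["cardio", "variable", "value"])["value"]

# Named aggregation: the keyword becomes the output column name.
summed = grouped.agg(total="sum").reset_index()     # adds up the values
counted = grouped.agg(total="count").reset_index()  # counts rows per group

print(summed)
print(counted)
```

For the group (cardio=0, variable='smoke', value=0), the sum is 0 while the count is 2, so the two results differ whenever a group's value is not 1.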
Upvotes: 2