Abhishek Sengupta
Reputation: 3301

Pandas column name coming in two levels

My initial DataFrame: [image: initial DF]

Hi, I was working on an assignment. After a DataFrame melt followed by a groupby and count, I noticed something interesting in the result:

melted_df = df.melt(id_vars='cardio', value_vars=['cholesterol', 'gluc', 'smoke', 'alco', 'active', 'overweight'])

melted_df = pd.DataFrame(melted_df.groupby(['cardio', 'variable', 'value'])['value'].count())

After this operation, the column headers are displayed on two levels, like this:

[image]

I can only rename the first-level names, not the lower-level ones.

Can someone explain why there are two levels?

Upvotes: 2

Views: 700

Answers (1)

Andreas
Reputation: 9207

You need to pass as_index=False to groupby.

pd.DataFrame(melted_df.groupby(['cardio', 'variable', 'value'], as_index=False)['value'].count())

The problem occurs because you pass an existing pandas object (the grouped result, which keeps the group keys as its index) as data into a new DataFrame.

Is there a reason you don't assign the grouped result directly?

melted_df = melted_df.groupby(['cardio', 'variable', 'value'], as_index=False)['value'].count()
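To see where the second header row comes from, here is a minimal sketch on a small made-up frame. The data and the names `melted`, `counts`, `wrapped`, and `flat` are illustrative assumptions, not taken from the question. Grouping by three keys gives a result whose index carries those keys; wrapping it in pd.DataFrame keeps that index, and pandas prints the index names on a row below the column name. Using size() with reset_index(name=...) is one alternative that turns the keys back into ordinary columns.

```python
import pandas as pd

# Hypothetical toy data standing in for the melted frame from the question.
melted = pd.DataFrame({
    "cardio":   [0, 0, 0, 1, 1, 1],
    "variable": ["smoke", "smoke", "alco", "smoke", "alco", "alco"],
    "value":    [0, 0, 1, 1, 0, 0],
})

# Grouping by three keys returns a Series indexed by a three-level MultiIndex.
counts = melted.groupby(["cardio", "variable", "value"])["value"].count()

# Wrapping the Series keeps that index: one column ("value") plus three
# index levels, whose names are displayed on a separate header row.
wrapped = pd.DataFrame(counts)
print(wrapped.index.names)
print(list(wrapped.columns))

# Alternative: count rows per group and move the keys back into columns.
flat = (
    melted.groupby(["cardio", "variable", "value"])
    .size()
    .reset_index(name="total")
)
print(flat)
```

In `flat`, every header sits on a single row, so the usual rename(columns=...) applies to all of them.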

Based on the request in your comment:

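import pandas as pd
df = pd.read_csv(r"D:\test\medical_examination.csv")
# Reshape to long format: one row per (id, cardio, variable, value)
df = df.melt(id_vars=['id', 'cardio'], value_vars=['cholesterol', 'gluc', 'smoke', 'alco', 'active'])
# Aggregate per group into a column named "total", then turn the group keys back into columns
df = df.groupby(['cardio', 'variable', 'value'])['value'].agg(total='sum').reset_index()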
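If you would rather keep the group keys as the index, the names on the lower header row are index level names, so they are changed with rename_axis rather than rename(columns=...). A small sketch on made-up data; the frame contents and the new names `feature` and `count` are hypothetical:

```python
import pandas as pd

# Hypothetical toy data standing in for the melted frame.
melted = pd.DataFrame({
    "cardio":   [0, 0, 0, 1, 1, 1],
    "variable": ["smoke", "smoke", "alco", "smoke", "alco", "alco"],
    "value":    [0, 0, 1, 1, 0, 0],
})

# Same shape as in the question: one data column, three index levels.
wrapped = pd.DataFrame(
    melted.groupby(["cardio", "variable", "value"])["value"].count()
)

renamed = (
    wrapped
    .rename(columns={"value": "count"})           # renames the data column
    .rename_axis(index={"variable": "feature"})   # renames an index level
)
print(renamed)
```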

Upvotes: 2
