LivingstoneM

Reputation: 1088

Change current column names to rows and replace with other column names

Hello guys, I have a messy data frame in which the first row of values appears as the column names. I want to move those values back into a row and give the columns proper names. This is what the raw data frame looks like:

import pandas as pd

# dictionary of lists
dict = {'Erick':["aparna", "pankaj", "sudhir", "Geeku"], 
        'MBA': ["MBA", "BCA", "M.Tech", "MBA"], 
        '80':[90, 40, 80, 98]} 

df = pd.DataFrame(dict) 

print(df)

Now I want to turn those column names into a row and replace them with new column names. Here is the expected output:

# dictionary of lists 
dict = {'Name':["Erick","aparna", "pankaj", "sudhir", "Geeku"], 
        'Degree': ["MBA","MBA", "BCA", "M.Tech", "MBA"], 
        'Score':[80,90, 40, 80, 98]} 

df = pd.DataFrame(dict) 

print(df)

Please help

Upvotes: 1

Views: 937

Answers (3)

Pugio

Reputation: 16

This is kind of a hyperspecific question, so I doubt there is a very slick answer, but I tried anyway.

Assuming you have a second list with the new keys in the right order, you can do this:

dict = {'Erick':["aparna", "pankaj", "sudhir", "Geeku"], 
        'MBA': ["MBA", "BCA", "M.Tech", "MBA"], 
        '80':[90, 40, 80, 98]} 

new_dict_keys = ['Name', 'Degree', 'Score']
new_dict = {}

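for i, key in enumerate(dict.keys()):
    # append the old column name to its own list, as an int when it parses as one
    try:
        dict[key].append(int(key))
    except ValueError:
        dict[key].append(key)

    # store the extended list under the new column name
    new_dict[new_dict_keys[i]] = dict[key]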

print(new_dict)

Please keep in mind that your new_dict_keys list needs to be in the same order as the old keys for this to work. Alternatively, you could map each new name to its old name explicitly:

new_dict = {'Name': 'Erick', 'Degree': 'MBA', 'Score': '80'}

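for key in new_dict.keys():
    old_key = new_dict[key]
    # append the old column name to its own list, as an int when it parses as one
    try:
        dict[old_key].append(int(old_key))
    except ValueError:
        dict[old_key].append(old_key)

    # replace the old name with the extended list
    new_dict[key] = dict[old_key]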

print(new_dict)

Both produce the following (note that the old header values end up at the end of each list rather than at the start):

{
'Name': ['aparna', 'pankaj', 'sudhir', 'Geeku', 'Erick'], 
'Degree': ['MBA', 'BCA', 'M.Tech', 'MBA', 'MBA'], 
'Score': [90, 40, 80, 98, 80]
}

It's up to you how you want to supply the new key names.

One last thing though: don't use dict as your variable name. It is the name of a Python built-in type, and assigning to it shadows the built-in for the rest of your code.
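To illustrate the point about `dict`, here is a minimal sketch of what goes wrong once the name is reassigned, and one way to undo it. The variable names `shadowed_call_failed` and `restored` are made up for this example.

```python
import builtins

# After this assignment the name `dict` refers to a plain dictionary,
# not to the built-in type.
dict = {'Erick': ["aparna", "pankaj"]}

try:
    # Calling the built-in constructor no longer works under this name.
    dict(zip(['a'], [1]))
    shadowed_call_failed = False
except TypeError:
    # TypeError: 'dict' object is not callable
    shadowed_call_failed = True

# Point the name back at the built-in type.
dict = builtins.dict
restored = dict(zip(['a'], [1]))
```

Picking another name for the data in the first place (for example `d` or `data`) avoids the problem entirely.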

Upvotes: 0

jezrael

Reputation: 862651

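One idea is to create a one-row DataFrame from the column names and concatenate the original data below it with pd.concat (DataFrame.append, the older way to do this, was removed in pandas 2.0):

df = pd.concat([df.columns.to_series().to_frame().T, df], ignore_index=True)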
df.columns = ['Name','Degree','Score']
print(df)
     Name  Degree Score
0   Erick     MBA    80
1  aparna     MBA    90
2  pankaj     BCA    40
3  sudhir  M.Tech    80
4   Geeku     MBA    98

Or use setting with enlargement:

df.loc[-1] = df.columns
df = df.sort_index().reset_index(drop=True)
df.columns = ['Name','Degree','Score']
print(df)
     Name  Degree Score
0   Erick     MBA    80
1  aparna     MBA    90
2  pankaj     BCA    40
3  sudhir  M.Tech    80
4   Geeku     MBA    98

Or build the header row with the DataFrame constructor and rename the original columns with a dictionary:

#use a name other than dict for the data, and restore the builtin
#to avoid TypeError: 'dict' object is not callable
import builtins
dict = builtins.dict

d = {'Erick':["aparna", "pankaj", "sudhir", "Geeku"], 
        'MBA': ["MBA", "BCA", "M.Tech", "MBA"], 
        '80':[90, 40, 80, 98]} 

df = pd.DataFrame(d) 


c = ['Name','Degree','Score']
#pd.concat is used here because DataFrame.append was removed in pandas 2.0
renamed = df.rename(columns=dict(zip(df.columns, c)))
df = pd.concat([pd.DataFrame([df.columns], columns=c), renamed], ignore_index=True)
print(df)
     Name  Degree Score
0   Erick     MBA    80
1  aparna     MBA    90
2  pankaj     BCA    40
3  sudhir  M.Tech    80
4   Geeku     MBA    98
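One detail worth checking in all of these variants: the former header value '80' is a string, so the Score column ends up with object dtype instead of integers. The following is a sketch of one way to clean that up, assuming Score should be numeric. The names `new_names`, `header_row`, `body` and `result` are only illustrative.

```python
import pandas as pd

d = {'Erick': ["aparna", "pankaj", "sudhir", "Geeku"],
     'MBA': ["MBA", "BCA", "M.Tech", "MBA"],
     '80': [90, 40, 80, 98]}
df = pd.DataFrame(d)

new_names = ['Name', 'Degree', 'Score']

# One-row frame holding the old column names under the new names.
header_row = pd.DataFrame([df.columns.tolist()], columns=new_names)

# Copy of the original data with the new column names.
body = df.copy()
body.columns = new_names

# Stack the header row on top of the data.
result = pd.concat([header_row, body], ignore_index=True)

# '80' came from a column label, so it is a string; convert the column.
result['Score'] = pd.to_numeric(result['Score'])
```

pd.to_numeric raises an error if a value cannot be parsed, which is usually what you want here, since it flags a header that was not really a score.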

Upvotes: 1

Ghanshyam Savaliya

Reputation: 608

You can also try the code below to get your desired output:

dict = {'Erick':["aparna", "pankaj", "sudhir", "Geeku"], 
        'MBA': ["MBA", "BCA", "M.Tech", "MBA"], 
        '80':[90, 40, 80, 98]}

df = (pd.DataFrame(dict).T.reset_index().T.reset_index()).drop(['index'],axis=1)
df.columns = ['Name','Degree','Score']
print(df)
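For readers wondering why the chained one-liner works, here is the same idea split into steps. This is a sketch, not a replacement for the code above: it uses `reset_index(drop=True)` in the last step instead of `reset_index()` followed by `drop`, and the names `step1` to `step4` are made up for the illustration.

```python
import pandas as pd

d = {'Erick': ["aparna", "pankaj", "sudhir", "Geeku"],
     'MBA': ["MBA", "BCA", "M.Tech", "MBA"],
     '80': [90, 40, 80, 98]}
df = pd.DataFrame(d)

# 1. Transpose: the old column names become the row index.
step1 = df.T

# 2. Move that index into a regular column (pandas names it 'index').
step2 = step1.reset_index()

# 3. Transpose back: the old names are now the first data row.
step3 = step2.T

# 4. Throw away the leftover row labels and set the new column names.
step4 = step3.reset_index(drop=True)
step4.columns = ['Name', 'Degree', 'Score']
```

Because the data passes through a transpose, every column comes out with object dtype, so a numeric conversion of Score may still be needed afterwards.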

Kindly let me know if this code works for you.

Upvotes: 0
