Reputation: 1088
Hello guys, I have a messy data frame in which the first row of values has ended up as the column names. I want to move those values back into the data as an ordinary first row and give the columns proper names. This is what the raw data frame looks like:
import pandas as pd
# dictionary of lists
dict = {'Erick': ["aparna", "pankaj", "sudhir", "Geeku"],
        'MBA': ["MBA", "BCA", "M.Tech", "MBA"],
        '80': [90, 40, 80, 98]}
df = pd.DataFrame(dict)
print(df)
Now I want those column names to become the first row, with new column names in their place. Here is the expected output:
# dictionary of lists
dict = {'Name':["Erick","aparna", "pankaj", "sudhir", "Geeku"],
'Degree': ["MBA","MBA", "BCA", "M.Tech", "MBA"],
'Score':[80,90, 40, 80, 98]}
df = pd.DataFrame(dict)
print(df)
Please help
Upvotes: 1
Views: 937
Reputation: 16
This is kind of a hyperspecific question, so I doubt there is a very slick answer, but I tried anyway.
Assuming you have a list of the new keys in the right order, you can do this:
dict = {'Erick': ["aparna", "pankaj", "sudhir", "Geeku"],
        'MBA': ["MBA", "BCA", "M.Tech", "MBA"],
        '80': [90, 40, 80, 98]}
new_dict_keys = ['Name', 'Degree', 'Score']
new_dict = {}
for i, key in enumerate(dict.keys()):
    try:
        # numeric-looking keys such as '80' are stored as integers
        dict[key].append(int(key))
    except ValueError:
        # any other key is stored unchanged
        dict[key].append(key)
    new_dict[new_dict_keys[i]] = dict[key]
print(new_dict)
Please keep in mind that your new_dict_keys list needs to be in the same order as the original keys for this to work.
Otherwise, you could map each new key to its old key explicitly:
# run this on the original dict, not on one already modified by the loop above
new_dict = {'Name': 'Erick', 'Degree': 'MBA', 'Score': '80'}
for key in new_dict:
    old_key = new_dict[key]
    try:
        dict[old_key].append(int(old_key))
    except ValueError:
        dict[old_key].append(old_key)
    new_dict[key] = dict[old_key]
print(new_dict)
Both return the following (note that the old column names end up as the last element of each list rather than the first):
{
'Name': ['aparna', 'pankaj', 'sudhir', 'Geeku', 'Erick'],
'Degree': ['MBA', 'BCA', 'M.Tech', 'MBA', 'MBA'],
'Score': [90, 40, 80, 98, 80]
}
It's up to you how you want to supply the new key names.
One last thing though: don't use dict as your variable name. It shadows Python's built-in dict type, so any later call such as dict(...) will fail.
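If the old column names should come first in each list, as in the expected output from the question, a small variation is possible. This is only a sketch: the names data, new_keys and as_value are made up for illustration, and it builds new lists instead of appending to the original ones.

```python
# `data` is used instead of `dict` so the built-in type is not shadowed.
data = {
    "Erick": ["aparna", "pankaj", "sudhir", "Geeku"],
    "MBA": ["MBA", "BCA", "M.Tech", "MBA"],
    "80": [90, 40, 80, 98],
}
new_keys = ["Name", "Degree", "Score"]  # must follow the order of the old keys


def as_value(text):
    """Return int(text) when the old key looks like an integer, else the text."""
    try:
        return int(text)
    except ValueError:
        return text


# Put each old key in front of its own values, stored under the new key.
new_data = {
    new_key: [as_value(old_key)] + values
    for new_key, (old_key, values) in zip(new_keys, data.items())
}
print(new_data)
```

Because the lists are rebuilt rather than modified in place, data stays unchanged and the snippet can be run more than once.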
Upvotes: 0
Reputation: 862651
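One idea is to create a one-row DataFrame from the columns and use DataFrame.append to add the original data (DataFrame.append was removed in pandas 2.0, so this needs an older version):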
df = df.columns.to_series().to_frame().T.append(df, ignore_index=True)
df.columns = ['Name','Degree','Score']
print(df)
     Name  Degree  Score
0   Erick     MBA     80
1  aparna     MBA     90
2  pankaj     BCA     40
3  sudhir  M.Tech     80
4   Geeku     MBA     98
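On pandas versions where DataFrame.append is not available, the same idea can be sketched with pd.concat. The variable names header_row and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Erick": ["aparna", "pankaj", "sudhir", "Geeku"],
    "MBA": ["MBA", "BCA", "M.Tech", "MBA"],
    "80": [90, 40, 80, 98],
})

# One-row frame whose values are the current column labels.
header_row = df.columns.to_series().to_frame().T

# Stack the header row on top of the data and renumber the rows from 0.
result = pd.concat([header_row, df], ignore_index=True)
result.columns = ["Name", "Degree", "Score"]
print(result)
```

Both frames share the same column labels at the time of the concat, so the rows line up before the columns are renamed.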
Or use setting with enlargement:
df.loc[-1] = df.columns
df = df.sort_index().reset_index(drop=True)
df.columns = ['Name','Degree','Score']
print(df)
     Name  Degree  Score
0   Erick     MBA     80
1  aparna     MBA     90
2  pankaj     BCA     40
3  sudhir  M.Tech     80
4   Geeku     MBA     98
Or create the DataFrame with the constructor and rename the columns with a dictionary:
# use a name other than dict for the data and restore the built-in to avoid
# TypeError: 'dict' object is not callable
import builtins
dict = builtins.dict
d = {'Erick':["aparna", "pankaj", "sudhir", "Geeku"],
'MBA': ["MBA", "BCA", "M.Tech", "MBA"],
'80':[90, 40, 80, 98]}
df = pd.DataFrame(d)
c = ['Name','Degree','Score']
df = pd.DataFrame([df.columns], columns=c).append(
    df.rename(columns=dict(zip(df.columns, c))), ignore_index=True)
print(df)
     Name  Degree  Score
0   Erick     MBA     80
1  aparna     MBA     90
2  pankaj     BCA     40
3  sudhir  M.Tech     80
4   Geeku     MBA     98
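One detail the printed output hides: in all three variants the old header '80' enters the Score column as a string, so the column has object dtype rather than integers. A minimal sketch of converting it afterwards, assuming every value in the column is numeric or a numeric string:

```python
import pandas as pd

# The frame as it looks after any of the approaches above:
# the first Score value is the string "80", the rest are integers.
df = pd.DataFrame({
    "Name": ["Erick", "aparna", "pankaj", "sudhir", "Geeku"],
    "Degree": ["MBA", "MBA", "BCA", "M.Tech", "MBA"],
    "Score": ["80", 90, 40, 80, 98],
})

# Parse the mixed column into a proper numeric column.
df["Score"] = pd.to_numeric(df["Score"])
print(df.dtypes)
```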
Upvotes: 1
Reputation: 608
You can also try the code below to get your desired output:
import pandas as pd
data = {'Erick': ["aparna", "pankaj", "sudhir", "Geeku"],
        'MBA': ["MBA", "BCA", "M.Tech", "MBA"],
        '80': [90, 40, 80, 98]}
# transpose, turn the old column names into data, transpose back, then drop the helper column
df = (pd.DataFrame(data).T.reset_index().T.reset_index()).drop(['index'], axis=1)
df.columns = ['Name', 'Degree', 'Score']
print(df)
Kindly let me know if this code works for you.
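To make the chained transposes easier to follow, here is the same idea split into steps. It is a sketch with illustrative variable names (step1, step2, result), and it uses reset_index(drop=True) in place of the final reset_index().drop(...), which should give the same frame.

```python
import pandas as pd

data = {
    "Erick": ["aparna", "pankaj", "sudhir", "Geeku"],
    "MBA": ["MBA", "BCA", "M.Tech", "MBA"],
    "80": [90, 40, 80, 98],
}
df = pd.DataFrame(data)

# After transposing, the old column labels are the row index;
# reset_index() turns them into an ordinary first column named "index".
step1 = df.T.reset_index()

# Transposing back makes that first column the first row.
step2 = step1.T

# Discard the leftover row labels ("index", 0, 1, 2, 3) and number rows 0..4.
result = step2.reset_index(drop=True)
result.columns = ["Name", "Degree", "Score"]
print(result)
```

Transposing a frame with mixed column types produces object columns, so the columns of result are all object dtype after this round trip.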
Upvotes: 0