Reputation: 511
I have a dataframe:
Name Sub_Marks
0 Tom {'Maths': 30, 'English': 40, 'Science': 35}
1 Harry {'Maths': 35, 'English': 30, 'Science': 25}
2 Nick {'Maths': 32, 'English': 23, 'Science': 20}
I need to explode the dictionary into multiple rows. E.g.
Name Subject Marks
0 Tom Maths 30
1 Tom English 40
2 Tom Science 35
3 Harry Maths 35
4 Harry English 30
5 Harry Science 25
6 Nick Maths 32
7 Nick English 23
8 Nick Science 20
I know we can explode list in dataframe. Is there any way to do it with dictionary?
Upvotes: 7
Views: 8403
Reputation: 71689
We can create a list of tuples of Name
, Marks
and Subjects
by iterating over the values of dataframe inside a list comprehension, then we can create a new dataframe from this list of tuples
out = pd.DataFrame([(n, k, v) for (n, d) in df.values for k, v in d.items()])
out.columns = ['Name', 'Subject', 'Marks']
Alternative pandas
based approach
m = pd.DataFrame([*df['Sub_Marks']], df.index).stack()\
.rename_axis([None,'Subject']).reset_index(1, name='Marks')
out = df[['Name']].join(m)
>>> out
Name Subject Marks
0 Tom Maths 30
1 Tom English 40
2 Tom Science 35
3 Harry Maths 35
4 Harry English 30
5 Harry Science 25
6 Nick Maths 32
7 Nick English 23
8 Nick Science 20
Upvotes: 15
Reputation: 1377
You can extract only the values from the dictionaries
and then expand it to multiple columns like this:
data = {'Name' : ['Tom', "Harry", "Nick"], "Sub_Marks" : [{'Maths': 30, 'English': 40, 'Science': 35},{'Maths': 35, 'English': 42, 'Science': 31},{'Maths': 20, 'English': 14, 'Science': 65}]}
df = pd.DataFrame(data)
df[['Maths','English', 'Science']] = df['Sub_Marks'].apply(pd.Series)
df.drop(columns=['Sub_Marks'], inplace=True)
df = df.set_index('Name').stack().reset_index()
df.columns = ['Name', 'Subject', 'Marks']
Upvotes: 2
Reputation: 23217
You can use .apply()
with pd.Series()
to 'explode' the dictionary into columns and then use .melt()
to transform the columns into Subject
and Marks
columns, as follows:
(df.drop(columns='Sub_Marks')
.join(df.apply(lambda x: pd.Series(x['Sub_Marks']), axis=1))
.melt(id_vars='Name', value_vars=['Maths', 'English', 'Science'], var_name='Subject', value_name='Marks')
.sort_values('Name')
).reset_index(drop=True)
You can also use pd.DataFrame()
together with to_list()
to 'explode' the dictionary:
(df.join(pd.DataFrame(df.pop('Sub_Marks').to_list()))
.melt(id_vars='Name', value_vars=['Maths', 'English', 'Science'], var_name='Subject', value_name='Marks')
.sort_values('Name')
).reset_index(drop=True)
Output:
Name Subject Marks
0 Harry Maths 35
1 Harry English 30
2 Harry Science 25
3 Nick Maths 32
4 Nick English 23
5 Nick Science 20
6 Tom Maths 30
7 Tom English 40
8 Tom Science 35
Upvotes: 1
Reputation: 14949
Alternate method via explode
-
df['Marks'] = df['Sub_Marks'].apply(lambda x: x.values())
df = df.apply(pd.Series.explode).rename(columns = {'Sub_Marks': 'Subject'})
You might wanna use ast.literal_eval
first (if above method doesn't work) -
import ast
df['Sub_Marks'] = df['Sub_Marks'].apply(ast.literal_eval)
Upvotes: 0