Reputation: 195
I have a dataframe like this:
df = pd.DataFrame({"id":[1, 1, 1, 2, 2, 2, 2, 3, 3], "val":["A12", "B23", "C34", "A12", "C34", "E45", "F56", "G67", "B23"]})
print(df)
id val
0 1 A12
1 1 B23
2 1 C34
3 2 A12
4 2 C34
5 2 E45
6 2 F56
7 3 G67
8 3 B23
How do I convert it to look like this?
id A12 B23 C34 E45 F56 G67
0 1 1 1 1 0 0 0
1 2 1 0 1 1 1 0
2 3 0 1 0 0 0 1
I tried pivot and unstack but since the number of values in the 'val' column can be different for each 'id', I'm not able to create a master list of columns and then somehow fill the values in those columns. Please help.
Upvotes: 2
Views: 67
Reputation: 195438
You can use pandas.get_dummies()
and then groupby()
:
print( pd.concat([df, pd.get_dummies(df['val'])], axis=1).groupby('id').sum().reset_index() )
Prints:
id A12 B23 C34 E45 F56 G67
0 1 1 1 1 0 0 0
1 2 1 0 1 1 1 0
2 3 0 1 0 0 0 1
Upvotes: 1
Reputation: 150745
Another way with get_dummies
and groupby
:
(pd.get_dummies(df.val)
.groupby(df['id'])
.sum()
.reset_index()
)
Or set_index
and get_dummies
:
(pd.get_dummies(df.set_index('id').val)
.sum(level='id')
.reset_index()
)
Output:
id A12 B23 C34 E45 F56 G67
0 1 1 1 1 0 0 0
1 2 1 0 1 1 1 0
2 3 0 1 0 0 0 1
Upvotes: 2