Reputation: 3
I am trying to create dummy variables in Python in a pandas DataFrame. I have a column called "Weight_Group" and I want to transform it like so:
Before transformation:
   Weight_Group
0             1
1             5
2             4
3             2
4             2
5             3
6             1
After transformation:
   WD_1  WD_2  WD_3  WD_4  WD_5
0     1     0     0     0     0
1     1     1     1     1     1
2     1     1     1     1     0
3     1     1     0     0     0
4     1     1     0     0     0
5     1     1     1     0     0
6     1     0     0     0     0
I know that pandas has the get_dummies() function that creates dummy variables, but it doesn't give me the behavior I want, where someone in weight group 3 has ones in the WD_1, WD_2, and WD_3 columns. I have a lot of data points, so a fast method would be great. If anyone has any ideas on how I can implement this, I would really appreciate it!
Upvotes: 0
Views: 640
Reputation: 9019
You can call pd.get_dummies(), replace the 0 entries with NaN, and then use bfill() along each row so that every column to the left of the 1 gets filled in (plus a bit of extra cleanup for display):
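# dtype=int keeps the dummies numeric; newer pandas versions return booleans by default, which replace(0, ...) would leave untouched
pd.get_dummies(df['Weight_Group'], prefix='WD', dtype=int).replace(0, np.nan).bfill(axis=1).fillna(0).astype(int)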
Yields:
   WD_1  WD_2  WD_3  WD_4  WD_5
0     1     0     0     0     0
1     1     1     1     1     1
2     1     1     1     1     0
3     1     1     0     0     0
4     1     1     0     0     0
5     1     1     1     0     0
6     1     0     0     0     0
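Since the question asks for something fast, here is a rough alternative sketch that skips get_dummies() entirely and compares each value against every level with NumPy broadcasting. The helper name cumulative_dummies and its prefix parameter are made up for illustration, and it assumes the groups are positive integers starting at 1:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Weight_Group": [1, 5, 4, 2, 2, 3, 1]})


def cumulative_dummies(groups, prefix="WD"):
    """Return a frame with 1 in column k wherever the group value is >= k.

    Hypothetical helper; assumes `groups` is a Series of positive integers.
    """
    # One level per integer from 1 up to the largest observed group.
    levels = np.arange(1, groups.max() + 1)
    # Broadcasting: an (n, 1) column of values against a (k,) row of levels.
    values = (groups.to_numpy()[:, None] >= levels).astype(int)
    columns = [f"{prefix}_{k}" for k in levels]
    return pd.DataFrame(values, index=groups.index, columns=columns)


result = cumulative_dummies(df["Weight_Group"])
print(result)
```

One difference to keep in mind: get_dummies() only creates columns for values that actually occur in the data, whereas this sketch always produces every column from 1 up to the maximum, even if some group in between is missing.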
Upvotes: 3