ah bon

Reputation: 10061

Convert string values in multiple columns to numbers in Pandas

I'm working with a DataFrame like this:

   id type1 type2 type3
0   1   dog   NaN   NaN
1   2   cat   NaN   NaN
2   3   dog   cat   NaN
3   4   cow   NaN   NaN
4   5   dog   NaN   NaN
5   6   cat   NaN   NaN
6   7   cat   dog   cow
7   8   dog   NaN   NaN

How can I transform it into the following DataFrame? Thank you.

   id  dog  cat  cow
0   1  1.0  NaN  NaN
1   2  NaN  1.0  NaN
2   3  1.0  1.0  NaN
3   4  NaN  NaN  1.0
4   5  1.0  NaN  NaN
5   6  NaN  1.0  NaN
6   7  1.0  1.0  1.0
7   8  1.0  NaN  NaN

Upvotes: 2

Views: 39

Answers (1)

jezrael

Reputation: 863781

First select only the type columns with DataFrame.filter and reshape them with DataFrame.stack, so that Series.str.get_dummies can be called. Then, for 0/1 output, take the max per first level of the MultiIndex (groupby(level=0).max(); the older max(level=0) was removed in pandas 2.0) and change 0 to NaN with DataFrame.mask. Finally, add the first column back with DataFrame.join:

df1 = df.filter(like='type').stack().str.get_dummies().groupby(level=0).max().mask(lambda x: x == 0)
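To see what each step of that chain produces, here is a minimal sketch that rebuilds the sample frame from the question and names the intermediate results. The variable names (stacked, dummies, per_row) are only illustrative, and the explicit dropna() is an added assumption so that the result does not depend on whether stack() keeps missing values in your pandas version.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "id": range(1, 9),
    "type1": ["dog", "cat", "dog", "cow", "dog", "cat", "cat", "dog"],
    "type2": [np.nan, np.nan, "cat", np.nan, np.nan, np.nan, "dog", np.nan],
    "type3": [np.nan] * 6 + ["cow", np.nan],
})

# Long Series indexed by (row label, column name); missing values dropped
stacked = df.filter(like="type").stack().dropna()

# One 0/1 column per distinct string, still one line per (row, column) pair
dummies = stacked.str.get_dummies()

# Collapse back to one line per original row: 1 if the value occurs anywhere in it
per_row = dummies.groupby(level=0).max()

# Replace 0 with NaN; the remaining 1s become 1.0 because the dtype turns into float
df1 = per_row.mask(per_row == 0)
```

Printing stacked and dummies side by side is usually the quickest way to check that the type columns were picked up as intended.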

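Or use get_dummies, take the max per column name, and finally change 0 to NaN (dtype=int keeps the dummies numeric, since pandas 2.0 and later return booleans by default):

df1 = (pd.get_dummies(df.filter(like='type'), prefix='', prefix_sep='', dtype=int)
         .T.groupby(level=0).max().T
         .mask(lambda x: x == 0))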

df = df[['id']].join(df1)
print(df)
   id  cat  cow  dog
0   1  NaN  NaN  1.0
1   2  1.0  NaN  NaN
2   3  1.0  NaN  1.0
3   4  NaN  1.0  NaN
4   5  NaN  NaN  1.0
5   6  1.0  NaN  NaN
6   7  1.0  1.0  1.0
7   8  NaN  NaN  1.0
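
The result above lists the new columns alphabetically, while the question asks for dog, cat, cow. If that order matters, one possible sketch (not part of the original answer) is to reorder the columns by first appearance. The helper name type_indicators and its parameters are hypothetical, chosen only for this illustration.

```python
import numpy as np
import pandas as pd

def type_indicators(df, like="type", id_col="id"):
    """Hypothetical helper: one column per distinct value, 1.0 where present, NaN otherwise."""
    # Long Series of all non-missing values, read row by row
    values = df.filter(like=like).stack().dropna()
    # Distinct values in order of first appearance
    order = list(values.unique())
    # 0/1 flags per original row
    flags = values.str.get_dummies().groupby(level=0).max()
    # Restore every original row and apply the first-appearance column order
    flags = flags.reindex(index=df.index, columns=order, fill_value=0)
    return df[[id_col]].join(flags.mask(flags == 0))

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "id": range(1, 9),
    "type1": ["dog", "cat", "dog", "cow", "dog", "cat", "cat", "dog"],
    "type2": [np.nan, np.nan, "cat", np.nan, np.nan, np.nan, "dog", np.nan],
    "type3": [np.nan] * 6 + ["cow", np.nan],
})

out = type_indicators(df)
print(out)
```

The reindex call also keeps rows that have no type value at all; they simply end up as NaN in every new column.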

Upvotes: 4
