pandas: map more than 2 columns to one column

Question

This is a follow-up to an earlier question, which dealt with mapping only two columns to a new column.

Now I have three columns that I want to map to a single new column using the same dictionary (and return 0 if there is no matching key in the dictionary).

>> codes = {'2':1,
            '31':1,
            '88':9,
            '99':9}

>> df[['driver_action1','driver_action2','driver_action3']].to_dict()    
{'driver_action1': {0: '1',
  1: '1',
  2: '77',
  3: '77',
  4: '1',
  5: '4',
  6: '2',
  7: '1',
  8: '77',
  9: '99'},
 'driver_action2': {0: '4',
  1: '99',
  2: '99',
  3: '99',
  4: '1',
  5: '2',
  6: '2',
  7: '99',
  8: '99',
  9: '99'},
 'driver_action3': {0: '4',
  1: '99',
  2: '99',
  3: '99',
  4: '1',
  5: '99',
  6: '99',
  7: '99',
  8: '31',
  9: '31'}}

Expected output:

  driver_action1 driver_action2 driver_action3  newcolumn
0              1              4              4          0
1              1             99             99          9
2             77             99             99          9
3             77             99             99          9
4              1              1              1          9
5              4              2             99          1
6              2              2             99          1
7              1             99             99          9
8             77             99             31          1
9             99             99             31          1

I am not sure how to do this with .applymap() or combine_first().

MaxU - stand with Ukraine · Accepted Answer

Try this:

In [174]: df['new'] = df.stack(dropna=False).map(codes).unstack() \
     ...:               .iloc[:, ::-1].ffill(axis=1) \
     ...:               .iloc[:, -1].fillna(0)
     ...:

In [175]: df
Out[175]:
  driver_action1 driver_action2 driver_action3  new
0              1              4              4  0.0
1              1             99             99  9.0
2             77             99             99  9.0
3             77             99             99  9.0
4              1              1              1  0.0
5              4              2             99  1.0
6              2              2             99  1.0
7              1             99             99  9.0
8             77             99             31  9.0
9             99             99             31  9.0

Alternative solution:

df['new'] = df.stack(dropna=False).map(codes).unstack().T \
              .apply(lambda x: x[x.first_valid_index()]
                               if x.first_valid_index() is not None else 0)

Explanation:

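Stack the columns into a single Series, map every value through codes (values with no matching key become NaN), then unstack back to the original shape: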

In [188]: df.stack(dropna=False).map(codes).unstack()
Out[188]:
   driver_action1  driver_action2  driver_action3
0             NaN             NaN             NaN
1             NaN             9.0             9.0
2             NaN             9.0             9.0
3             NaN             9.0             9.0
4             NaN             NaN             NaN
5             NaN             1.0             9.0
6             1.0             1.0             9.0
7             NaN             9.0             9.0
8             NaN             9.0             1.0
9             9.0             9.0             1.0
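
To see what each of the three calls contributes, here is a minimal sketch on a three-row subset of the question's data (rows 0, 6 and 8, re-indexed 0..2). The names `small`, `stacked`, `mapped_long` and `mapped` are made up for illustration. The subset has no missing values, so plain `stack()` is enough here; how `stack` treats missing values depends on the pandas version, so check the documentation for the one you run.

```python
import pandas as pd

codes = {'2': 1, '31': 1, '88': 9, '99': 9}

# Three rows taken from the question's data (rows 0, 6 and 8), re-indexed 0..2.
small = pd.DataFrame({
    'driver_action1': ['1', '2', '77'],
    'driver_action2': ['4', '2', '99'],
    'driver_action3': ['4', '99', '31'],
})

# stack: one long Series indexed by (row label, column name).
stacked = small.stack()

# map: look every value up in the dictionary; values without a key become NaN.
mapped_long = stacked.map(codes)

# unstack: move the column-name level back into the columns.
mapped = mapped_long.unstack()

print(stacked)
print(mapped)
```

Because the whole frame is flattened into one Series, a single `map` call covers all three columns with the same dictionary.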

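Reverse the column order and forward-fill along the column axis, so that the last column ends up holding the first non-NaN value in the original left-to-right order: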

In [190]: df.stack(dropna=False).map(codes).unstack().iloc[:, ::-1].ffill(axis=1)
Out[190]:
   driver_action3  driver_action2  driver_action1
0             NaN             NaN             NaN
1             9.0             9.0             9.0
2             9.0             9.0             9.0
3             9.0             9.0             9.0
4             NaN             NaN             NaN
5             9.0             1.0             1.0
6             9.0             1.0             1.0
7             9.0             9.0             9.0
8             1.0             9.0             9.0
9             1.0             9.0             9.0
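
As a side note, the reverse-then-ffill step should be interchangeable with a back-fill in the original column order followed by taking the first column. The sketch below checks that on the mapped frame from Out[188], typed in by hand; `via_ffill` and `via_bfill` are illustrative names, and the back-fill route is a swapped-in variant, not part of the answer above.

```python
import numpy as np
import pandas as pd

nan = np.nan

# The mapped frame shown in Out[188], entered by hand.
mapped = pd.DataFrame({
    'driver_action1': [nan, nan, nan, nan, nan, nan, 1.0, nan, nan, 9.0],
    'driver_action2': [nan, 9.0, 9.0, 9.0, nan, 1.0, 1.0, 9.0, 9.0, 9.0],
    'driver_action3': [nan, 9.0, 9.0, 9.0, nan, 9.0, 9.0, 9.0, 1.0, 1.0],
})

# The answer's route: reverse the columns, forward-fill, take the last column.
via_ffill = mapped.iloc[:, ::-1].ffill(axis=1).iloc[:, -1]

# Variant: back-fill in the original order, take the first column.
via_bfill = mapped.bfill(axis=1).iloc[:, 0]

# Both pick the first non-NaN value per row, reading left to right.
print(via_ffill.equals(via_bfill))
```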

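Select the last column and fill the remaining NaNs with 0 (the result is float because of the intermediate NaNs; append .astype(int) if integers are needed):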

In [191]: df.stack(dropna=False).map(codes).unstack().iloc[:, ::-1].ffill(axis=1).iloc[:, -1].fillna(0)
Out[191]:
0    0.0
1    9.0
2    9.0
3    9.0
4    0.0
5    1.0
6    1.0
7    9.0
8    9.0
9    9.0
Name: driver_action1, dtype: float64
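
For comparison, here is a plain-Python sketch of the same "first matching column wins, otherwise 0" rule, applied row by row. It rebuilds `df` from the dictionary in the question; `first_code` and `cols` are hypothetical names introduced for this example. It returns integers directly, but a row-wise `apply` is usually slower than the vectorised approach above on large frames.

```python
import pandas as pd

codes = {'2': 1, '31': 1, '88': 9, '99': 9}
cols = ['driver_action1', 'driver_action2', 'driver_action3']

# Data copied from the to_dict() output in the question.
df = pd.DataFrame({
    'driver_action1': ['1', '1', '77', '77', '1', '4', '2', '1', '77', '99'],
    'driver_action2': ['4', '99', '99', '99', '1', '2', '2', '99', '99', '99'],
    'driver_action3': ['4', '99', '99', '99', '1', '99', '99', '99', '31', '31'],
})

def first_code(row, default=0):
    """Return the code of the first value in the row that is a key of `codes`."""
    return next((codes[v] for v in row if v in codes), default)

# axis=1 passes each row as a Series; iterating it yields the cell values
# in column order, so the leftmost match is used.
df['new'] = df[cols].apply(first_code, axis=1)
print(df)
```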
