Python altering column based on other values

Question

I have this dataframe:

import pandas as pd
x = pd.DataFrame({'colA':['A','A','A','B','C','C'], 'colB':['X','nm','X','nm','nm','nm']})

x
Out[254]: 
  colA colB
0    A    X
1    A   nm
2    A    X
3    B   nm
4    C   nm
5    C   nm

I want to replace values in column B as follows:

For each unique value of column A, if any colB value in that group is 'X', then replace all colB values in the group with 'X', i.e. replace every 'nm' with 'X' within that group.

If a group in column A (e.g. the 'C' rows in this example) does not contain 'X' in column B, then just leave its 'nm' values alone.

The result should be:

Out[254]: 
  colA colB
0    A    X
1    A    X
2    A    X
3    B   nm
4    C   nm
5    C   nm

I have attempted to do this with groupby, counting the number of 'X' values that appear for each unique value in column A, but it feels very convoluted. I'm hoping there's an easier way.

user2285236 · Accepted Answer

You can do it with groupby.transform:

x.groupby('colA')['colB'].transform(lambda col: 'X' if 'X' in col.values else 'nm')
Out: 
0     X
1     X
2     X
3    nm
4    nm
5    nm
Name: colB, dtype: object
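
To see why transform fits here, it may help to compare it with agg on the same data. This is only a sketch: the helper name pick and the variables per_group and per_row are illustrative, not part of the original answer. agg returns one value per group, while transform broadcasts that value back to every row of the group, so the result lines up with the original index.

```python
import pandas as pd

x = pd.DataFrame({'colA': ['A', 'A', 'A', 'B', 'C', 'C'],
                  'colB': ['X', 'nm', 'X', 'nm', 'nm', 'nm']})

def pick(col):
    # Hypothetical helper: one scalar per group,
    # 'X' if the group contains any 'X', otherwise 'nm'
    return 'X' if 'X' in col.values else 'nm'

# agg: one row per group, indexed by colA (3 entries)
per_group = x.groupby('colA')['colB'].agg(pick)

# transform: same scalar repeated for each row, indexed like x (6 entries)
per_row = x.groupby('colA')['colB'].transform(pick)

print(per_group)
print(per_row)
```

Because per_row shares x's index, it can be assigned straight back to a column, which is not true of per_group.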

Assigning it back:

x['colB'] = x.groupby('colA')['colB'].transform(lambda col: 'X' if 'X' in col.values else 'nm')

x
Out: 
  colA colB
0    A    X
1    A    X
2    A    X
3    B   nm
4    C   nm
5    C   nm
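
One caveat: the lambda writes 'nm' into every group that has no 'X'. That matches this data, but it would overwrite any other values in such groups. A possible variant, assuming you want those groups left exactly as they are, builds a boolean mask per group and assigns only where the mask is True. The name has_x is illustrative, and the 'other' value in row 3 is changed from the question's data purely to show the difference.

```python
import pandas as pd

# Same frame as in the question, except row 3 holds 'other' instead of 'nm'
# (an assumption made only to show that non-'X' groups are left untouched)
x = pd.DataFrame({'colA': ['A', 'A', 'A', 'B', 'C', 'C'],
                  'colB': ['X', 'nm', 'X', 'other', 'nm', 'nm']})

# True for every row whose colA group contains at least one 'X'
has_x = x['colB'].eq('X').groupby(x['colA']).transform('any')

# Overwrite only the rows in those groups; everything else is kept as-is
x.loc[has_x, 'colB'] = 'X'

print(x)
```

This also avoids calling a Python lambda once per group, which may matter on larger frames.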
