Add columns to pandas Dataframe on different conditions

Question

I have a pandas dataframe.

import pandas as pd
data = pd.DataFrame({
'a': [0,1,0,0,1,1,0,1], 
'b': [0,0,1,0,1,0,1,1], 
'c': [0,0,0,1,0,1,1,1],
'rate': [0,0.1,0.11,0.12,0.24,0.27,0.3,0.4]})

a,b,c are my channels and I am adding another column that shows the sum of row wise totals of these channels by writing:

data['total'] = data.a + data.b + data.c

data

    a   b   c   rate    total
1   1   0   0   0.10    1
2   0   1   0   0.11    1 
3   0   0   1   0.12    1
4   1   1   0   0.24    2
5   1   0   1   0.27    2
6   0   1   1   0.30    2
7   1   1   1   0.40    3

I would like to work on data where total = 1 and total = 2

reduced = data[(data.a == 1) & (data.total == 2)]
print(reduced)

     a  b  c  rate  total
  4  1  1  0  0.24      2
  5  1  0  1  0.27      2

I would like to add columns to this reduced dataframe that looks like this:

      a  b  c  rate  total  prob_a  prob_b  prob_c
   4  1  1  0  0.24      2     0.1    0.11    0
   5  1  0  1  0.27      2     0.1     0      0.12

In the first row of the reduced data-frame, prob_c is 0 since C is absent ( ABC => 110). In the second row of the reduced data-frame, prob_b is 0 since B is absent (ABC => 101)

where,

# Channel a alone occurs (ABC => 100)
prob_a = data['rate'][(data.a == 1) & (data.total == 1)]

# Channel b alone occurs (ABC => 010)
prob_b = data['rate'][(data.b == 1) & (data.total == 1)]

# Channel c alone occurs (ABC => 001)
prob_c = data['rate'][(data.c == 1) & (data.total == 1)]

I tried this:

reduced['prob_a'] = data['rate'][(data.a == 1) & (data.total == 1)]
reduced['prob_b'] = data['rate'][(data.b == 1) & (data.total == 1)]
reduced['prob_c'] = data['rate'][(data.c == 1) & (data.total == 1)]
print(reduced)

which results in this output:

   a  b  c  rate  total  prob_a  prob_b  prob_c
4  1  1  0  0.24      2     NaN     NaN     NaN
5  1  0  1  0.27      2     NaN     NaN     NaN

BENY · Accepted Answer

IIUC

reduced['prob_a'] = data.loc[(data.a == 1) & (data.total == 1),'rate'].values[0]
reduced['prob_b'] = data.loc[(data.b == 1) & (data.total == 1),'rate'].values[0]
reduced['prob_c'] = data.loc[(data.c == 1) & (data.total == 1),'rate'].values[0]

reduced[['prob_a','prob_b','prob_c']]=reduced[['prob_a','prob_b','prob_c']].mul(reduced[['a','b','c']].eq(1).values)
reduced
Out[698]: 
   a  b  c  rate  total  prob_a  prob_b  prob_c
4  1  1  0  0.24      2     0.1    0.11    0.00
5  1  0  1  0.27      2     0.1    0.00    0.12

Add columns to pandas Dataframe on different conditions

Answers (2)

Related Questions