Pandas row function row iteration

Question

I have a dataframe and a maximum value:

import numpy as np
import pandas as pd

max_factor = 20
df = pd.DataFrame({'a':[1,1,1,2,2,2,3,3,3,3],'b':np.random.randn(10)})
df
   a         b
0  1 -0.424957
1  1  1.893320
2  1  0.187929
3  2 -1.413340
4  2  1.737371
5  2  0.959317
6  3 -0.554445
7  3  0.100595
8  3 -0.176009
9  3  0.430475

I want to create another column 'c' whose values depend on 'a', 'b' and the value of 'c' in the previous row.

For example:

if value(a) == 1 or value(a) == 2:
    value(c) = 0
else:
    value(c) = value(b)/max_factor + value(c-1)

I have tried multiple ways to do this but am struggling. Do I have to iterate through each row or is there a faster way to do this?

EDIT: The actual function to generate the values in column 'c' is more complicated but this would be a great starting point.

Leb · Accepted Answer

If your real function has more conditions than this example, you can use apply with a row function that keeps the previous value of 'c' in a global variable:

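v = 0  # running value of 'c', carried over from the previous row

def foo(row):
    global v
    if row['a'] == 1 or row['a'] == 2:
        v = 0  # reset the running value
    else:
        v = row['b']/max_factor + v  # scaled 'b' plus the previous 'c'

    return v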

df['c'] = df.apply(foo,axis=1)

Result (the values in 'b' differ from the question because they are randomly generated):

   a         b         c
0  1  0.858951  0.000000
1  1  0.588102  0.000000
2  1  1.452569  0.000000
3  2  1.400972  0.000000
4  2 -0.921342  0.000000
5  2 -1.117748  0.000000
6  3  0.792742  0.039637
7  3  0.254630  0.052369
8  3  0.351391  0.069938
9  3  1.822267  0.161052
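
For the simple rule in the question, a loop-free version also seems possible. The sketch below is not part of the original answer and assumes the rule stays exactly as stated: 'c' is reset to 0 whenever 'a' is 1 or 2, and otherwise grows by b/max_factor. Under that assumption 'c' is a cumulative sum that restarts at every reset row. The names reset and step are made up for illustration, and the 'b' values are copied from the output above so the result can be compared.

```python
import numpy as np
import pandas as pd

max_factor = 20
df = pd.DataFrame({
    'a': [1, 1, 1, 2, 2, 2, 3, 3, 3, 3],
    'b': [0.858951, 0.588102, 1.452569, 1.400972, -0.921342,
          -1.117748, 0.792742, 0.254630, 0.351391, 1.822267],
})

# True on rows where 'c' is forced back to 0
reset = df['a'].isin([1, 2])

# Per-row increment: b/max_factor on ordinary rows, 0 on reset rows
step = (df['b'] / max_factor).where(~reset, 0.0)

# reset.cumsum() changes value at every reset row, so each group starts
# at a reset and the cumulative sum restarts from 0 there
df['c'] = step.groupby(reset.cumsum()).cumsum()
```

If the real function is more complicated and cannot be written as a cumulative sum, this shortcut will not apply and the row-by-row approach above is the safer choice.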
