How to compute a column that depends on a period condition in pandas

Question

I'm not sure what title to give this question, but I'm clear about what I want to achieve.

I have the following dataframe:

import pandas as pd

period = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
final_renewal_percentage = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 1]
first_renewals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
df = pd.DataFrame({'period': period, 'first_renewals': first_renewals, 'final_renewal_percentage': final_renewal_percentage})

I need to compute the following column, renewals_of_renewals:

0    0.0 # this is 0 since period < 4
1    0.0 # this is 0 since period < 4
2    0.0 # this is 0 since period < 4
3    0.0 # this is 0 since period < 4
4    0.5 # this is 1 * 0.5 (first_renewals corresponding to period=0)
5    1.0 # this is 2 * 0.5 (first_renewals corresponding to period=1)
6    1.5 # this is 3 * 0.5 (first_renewals corresponding to period=2)
7    2.0 # this is 4 * 0.5 (first_renewals corresponding to period=3)
8    2.5 # this is 5 * 0.5 (first_renewals corresponding to period=4)
9    6.0 # this is 6 * 1 (first_renewals corresponding to period=5)
Name: renewals_of_renewals, dtype: float64

In short: if period < 4, renewals_of_renewals is 0. Otherwise, it is the product of first_renewals and final_renewal_percentage, where final_renewal_percentage comes from the current row and first_renewals comes from the row whose period is period - 4 (see the comments in the output above).

I was able to compute this with a for loop. However, I'd like to avoid the loop, and I have no idea how to achieve that.

derricw · Accepted Answer

I'd just do your calculation on the whole dataframe, then afterward set the zeros where you want them like this:

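import numpy as np
# Periods 0-3 produce negative indices, which wrap around to the end of the array; those rows are zeroed on the next line.
renewals_of_renewals = np.array(df['first_renewals'])[df['period']-4] * df['final_renewal_percentage']
renewals_of_renewals[np.where(df['period'] < 4)[0]] = 0.0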
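
As an alternative sketch (not part of the snippet above), the same result can probably be reached with Series.shift, which avoids the wrap-around indexing and the second zeroing step. This assumes the rows are sorted by period and that periods are consecutive integers, so "period - 4" is simply "four rows earlier". The name lag is introduced here only for illustration.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'period': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    'first_renewals': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'final_renewal_percentage': [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 1],
})

# Assumption: rows are ordered by period with no gaps, so shifting by
# `lag` rows lines each row up with first_renewals from period - lag.
lag = 4

# The first `lag` rows have no earlier row to draw from; fill_value=0
# makes their product 0, matching the "period < 4" rule.
shifted = df['first_renewals'].shift(lag, fill_value=0)

df['renewals_of_renewals'] = shifted * df['final_renewal_percentage']
print(df['renewals_of_renewals'])
```

If the periods were unsorted or had gaps, a positional shift would no longer line up with period - 4, and something like a lookup keyed on period would be needed instead.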
