Reputation: 51
original data is:
df=pd.DataFrame({'A': [1]*3 + [2]*3 + [1]*4 + [3]*5,
'B': [1.5]*2 + [2]*4 + [1.5]*5 + [3.2]*4})
how can I convert column A
and B
to:
and rule of A_con
and B_con
is:
A_con[0]=0; if(A[i]>A[i-1], A_con[i]=True, if(A[i]==A[i-1] && A_con[i-1]==True, A_con[i]=True, A_con[i]=False))
rule of days_A and days_B is:
if(A_con[i]==True, days_A[i]=days_A[i-1]+1, days_A[i]=0)
Upvotes: 3
Views: 79
Reputation: 62037
The diff
method takes the difference between the current row and the row above it. Anything positive will make A_con
True. The tricky part is when the difference is 0. When 0, the immediate above value can take it's place. This is done using the replace
with the ffill
method. That takes care of A_con
and B_con
For the days columns, we take an approach where we begin by taking a cumulative sum over the entire column of the A_con
column (True values evaluate to 1) with df['A_con'].cumsum()
. This obviously over counts as we must reset to 0 for any False values in A_con
and begin counting again when True.
To do this, the entire cumulative sum is subtracted whenever A_con
is False. But, when A_con
is True, only the accumulation up to the last False needs to be subtracted so that the counting can continue. This is done again by replacing all the True values (now evaluated as 0 with 1 - a_cum.diff()
) by forward filling with the last cumulative sum when A_con
was False.
# create a little more data to test
df=pd.DataFrame({'A': [1]*3 + [2]*3 + [1]*4 + [3]*5 + [2.2]*3 + [2.4]*3,
'B': [1.5]*2 + [2]*4 + [1.5]*5 + [3.2]*4 + [2.2]*3 + [2.4]*3})
df['A_con'] = df['A'].diff().replace(0, method='ffill') > 0
a_cum = df['A_con'].cumsum()
a_cum_sub = (a_cum * (1 - a_cum.diff())).replace(0, method='ffill').fillna(0)
df['days_A'] = a_cum - a_cum_sub
df['B_con'] = df['B'].diff().replace(0, method='ffill') > 0
b_cum = df['B_con'].cumsum()
b_cum_sub = (b_cum * (1 - b_cum.diff())).replace(0, method='ffill').fillna(0)
df['days_B'] = b_cum - b_cum_sub
With output
A B A_con days_A B_con days_B
0 1.0 1.5 False 0.0 False 0.0
1 1.0 1.5 False 0.0 False 0.0
2 1.0 2.0 False 0.0 True 1.0
3 2.0 2.0 True 1.0 True 2.0
4 2.0 2.0 True 2.0 True 3.0
5 2.0 2.0 True 3.0 True 4.0
6 1.0 1.5 False 0.0 False 0.0
7 1.0 1.5 False 0.0 False 0.0
8 1.0 1.5 False 0.0 False 0.0
9 1.0 1.5 False 0.0 False 0.0
10 3.0 1.5 True 1.0 False 0.0
11 3.0 3.2 True 2.0 True 1.0
12 3.0 3.2 True 3.0 True 2.0
13 3.0 3.2 True 4.0 True 3.0
14 3.0 3.2 True 5.0 True 4.0
15 2.2 2.2 False 0.0 False 0.0
16 2.2 2.2 False 0.0 False 0.0
17 2.2 2.2 False 0.0 False 0.0
18 2.4 2.4 True 1.0 True 1.0
19 2.4 2.4 True 2.0 True 2.0
20 2.4 2.4 True 3.0 True 3.0
Upvotes: 1