Reputation: 2064
I have a dataframe consisting of 1 column of periods (year and quarters) and another column of some productivity numbers for that period. My task is to identify a period where, for example, I have two consecutive quarters of productivity decline; or, similarly, two consecutive quarters of growth. I imagine I can use brute force and just loop over the rows looking at several rows at a time, but I was reading that this might have something to do with a "shift" function -- but I dont understand how it works. Thank you for your help
1971q1 1,137.8 1971q2 1,159.4 1971q3 1,180.3 1971q4 1,173.6 1972q1 1,163.8 1972q2 1,140.1 1972q3 1,145.8 1972q4 1,150.0
Upvotes: 1
Views: 547
Reputation: 210832
It's still not quite clear to me how your desired data set should look like.
Here is a method which helps to identify grows and drops:
In [450]: df
Out[450]:
period val
0 1971q1 1137.8
1 1971q2 1159.4
2 1971q3 1180.3
3 1971q4 1173.6
4 1972q1 1163.8
5 1972q2 1140.1
6 1972q3 1145.8
7 1972q4 1150.0
In [451]: np.sign(df.val.diff().fillna(0))
Out[451]:
0 0.0
1 1.0
2 1.0
3 -1.0
4 -1.0
5 -1.0
6 1.0
7 1.0
Name: val, dtype: float64
In [452]: df.loc[np.sign(df.val.diff().fillna(0)) < 0]
Out[452]:
period val
3 1971q4 1173.6
4 1972q1 1163.8
5 1972q2 1140.1
Upvotes: 1
Reputation: 19375
try this buddy
#define a growth rate
df['growth_rate'] = np.log(df.production) - np.log(df.production).shift(1)
#a recession is when there have been two quarters of negative growth.
df['recession'] = (df['growth_rate'] < 0 ) & (df['growth_rate'].shift(1) < 0 )
Upvotes: 2