bossangelo
bossangelo

Reputation: 87

How to Calculate increased amount for each period in Pandas

I have a df that contains JIRA ticket status summany for each time period, it contains the counts of 'open', 'close' and 'other'. I want to see the increased amount of over tickets period over period.

period                              Status  Counts
No. 1 Apr 06 2019 to Apr 12 2019    CLOSE   1026
No. 1 Apr 06 2019 to Apr 12 2019    OPEN    2914
No. 1 Apr 06 2019 to Apr 12 2019    OTHER   264
No. 2 Mar 30 2019 to Apr 05 2019    CLOSE   1307
No. 2 Mar 30 2019 to Apr 05 2019    OPEN    2212
No. 2 Mar 30 2019 to Apr 05 2019    OTHER   256 

for Period No.1 the count of OPEN status has increased from 2212(period No2) to 2914, so 702 ticket added for period No1. How do I add and extra colmun that shows.

period                              Status  Counts   Added
No. 1 Apr 06 2019 to Apr 12 2019    CLOSE   1026     702 (2914-2212)
No. 1 Apr 06 2019 to Apr 12 2019    OPEN    2914     702 
No. 1 Apr 06 2019 to Apr 12 2019    OTHER   264      702 
No. 2 Mar 30 2019 to Apr 05 2019    CLOSE   1307     (2212 minus  xxx)
No. 2 Mar 30 2019 to Apr 05 2019    OPEN    2212     (2212 minus  xxx)
No. 2 Mar 30 2019 to Apr 05 2019    OTHER   256      (2212 minus  xxx)

Upvotes: 1

Views: 65

Answers (3)

Valdi_Bo
Valdi_Bo

Reputation: 30991

Start from defining a function to be applied below:

def fn(src):
    return src.query("Status == 'OPEN'").Counts

Then, by application of this function:

df2 = df.groupby('period').apply(fn).diff(-1)\
    .fillna(0, downcast='infer')\
    .reset_index(level=1, drop=True).to_frame('Added')

you get a DataFrame with Added column.

And the last step is to merge both DataFrames:

pd.merge(df, df2, left_on='period', right_index=True)

Upvotes: 0

bossangelo
bossangelo

Reputation: 87

df['added'] = df.loc[df.Status== 'OPEN']['Counts'].diff(periods = -1)
df['added'] = df.groupby(['period'])['added'].bfill()
df['added'] = df.groupby(['period'])['added'].ffill()

use diff() function and use backwards and forward fill functions to fill NA.

Upvotes: 0

user3483203
user3483203

Reputation: 51155

You can locate the differences in OPEN, and then make use of transform('first') to fit these values back into the frame.

u = df.assign(Added=df.loc[df.Status.eq('OPEN'), 'Counts'].diff(-1))

u.assign(Added=u.groupby('period')['Added'].transform('first'))

                             period Status  Counts  Added
0  No. 1 Apr 06 2019 to Apr 12 2019  CLOSE    1026  702.0
1  No. 1 Apr 06 2019 to Apr 12 2019   OPEN    2914  702.0
2  No. 1 Apr 06 2019 to Apr 12 2019  OTHER     264  702.0
3  No. 2 Mar 30 2019 to Apr 05 2019  CLOSE    1307    NaN
4  No. 2 Mar 30 2019 to Apr 05 2019   OPEN    2212    NaN
5  No. 2 Mar 30 2019 to Apr 05 2019  OTHER     256    NaN

Upvotes: 2

Related Questions