Reputation: 69
I have the following Dataframe, here is a sample:
Area Num Month Year Type Result
AA 1 Jan 2015 A 1
AA 1 Feb 2015 A 2
AA 1 Mar 2015 A 3
AA 1 Apr 2015 A 4
AA 1 May 2015 A 5
AA 1 Jun 2015 A 6
AA 1 Jul 2015 A 7
AA 1 Aug 2015 A 8
AA 1 Sep 2015 A 9
AA 1 Oct 2015 A 10
AA 1 Nov 2015 A 11
AA 1 Dec 2015 A 12
BB 2 Jan 2015 B 1
BB 2 Feb 2015 B 2
BB 2 Mar 2015 B 3
BB 2 Apr 2015 B 4
BB 2 May 2015 B 5
BB 2 Jun 2015 B 6
BB 2 Jul 2015 B 7
BB 2 Aug 2015 B 8
BB 2 Sep 2015 B 9
BB 2 Oct 2015 B 10
BB 2 Nov 2015 B 11
BB 2 Dec 2015 B 12
I need to groupby by Columns: Area
, Num
, Type
, Year
and calculate 90th percentile for each of these groups. I have multiple Num
and Type
for each Area
.
My expected output is:
I need to calculate 12 month percentile value for previous twelve months for each of this group and paste the output in next month.
For example: Calculate the 90th percentile values for the group with {Area:AA,Num:1,Year:2015,Type:A} and assign it to {Area:AA,Num:1,Year:2016,Type:A,Month:Jan}.
Area Num Month Year Type Result
AA 1 Jan 2016 A 10.9
AA 1 Feb 2016 A 10.99
Now for new group(Month February) {Area:AA,Num:1,Year:2016,Type:A,Month:Feb} should calculate the value of 90th Percentile of Result columnn from Feb 2015 to Jan 2016 and so on.
I am finding solution to calculate percentile and groupby individually, but I am stuck as to how to calculate percentile on this moving window.
Upvotes: 2
Views: 2081
Reputation: 153510
df2 = df.assign(date=pd.to_datetime(df.Year.astype(str).add("-").add(df.Month.astype(str)).add("-").add(str(1))))
df2 = df2.set_index('date')
df2.groupby(['Area','Num','Type'])['Result'].rolling(12,min_periods=3).quantile(.9).reset_index()
Output:
Area Num Type date Result
0 AA 1 A 2015-01-01 NaN
1 AA 1 A 2015-02-01 NaN
2 AA 1 A 2015-03-01 2.0
3 AA 1 A 2015-04-01 3.0
4 AA 1 A 2015-05-01 4.0
5 AA 1 A 2015-06-01 5.0
6 AA 1 A 2015-07-01 6.0
7 AA 1 A 2015-08-01 7.0
8 AA 1 A 2015-09-01 8.0
9 AA 1 A 2015-10-01 9.0
10 AA 1 A 2015-11-01 10.0
11 AA 1 A 2015-12-01 10.0
12 BB 2 B 2015-01-01 NaN
13 BB 2 B 2015-02-01 NaN
14 BB 2 B 2015-03-01 2.0
15 BB 2 B 2015-04-01 3.0
16 BB 2 B 2015-05-01 4.0
17 BB 2 B 2015-06-01 5.0
18 BB 2 B 2015-07-01 6.0
19 BB 2 B 2015-08-01 7.0
20 BB 2 B 2015-09-01 8.0
21 BB 2 B 2015-10-01 9.0
22 BB 2 B 2015-11-01 10.0
23 BB 2 B 2015-12-01 10.0
Upvotes: 1