HunTer

Reputation: 87

sum of last 12 months of data where each month has 2 rows of data -- Spark SQL

My dataset has two columns: one column for the date (which contains both the month-start and the month-end date), and a second column containing the count.

1-Nov-18    58
30-Nov-18   76
1-Dec-18    93
31-Dec-18   57
1-Jan-19    62
31-Jan-19   78
1-Feb-19    87
28-Feb-19   75
1-Mar-19    54
31-Mar-19   59
1-Apr-19    76
30-Apr-19   79
1-May-19    81
31-May-19   80
1-Jun-19    71
30-Jun-19   52
1-Jul-19    75
31-Jul-19   57
1-Aug-19    76
31-Aug-19   76
1-Sep-19    63
30-Sep-19   57
1-Oct-19    81
30-Oct-19   84
and so on..

I need my output as:

Oct2019 1707

Basically, for the month Oct 2019 I need the sum of the counts for all records from Nov '18 to Oct '19 (the last 12 months); similarly, for Nov 2019 I would need the sum of all records from Dec '18 to Nov '19, and so on for the other months.

Upvotes: 0

Views: 967

Answers (2)

Gordon Linoff

Reputation: 1270593

I think you want aggregation and a window function with RANGE:

select date_trunc('MM', date) as mon, SUM(cnt) as month_cnt,
       SUM(SUM(cnt)) OVER (ORDER BY MIN(date)
                           RANGE BETWEEN INTERVAL 11 MONTH PRECEDING AND CURRENT ROW
                          ) as running_12_months
from t
group by date_trunc('MM', date);
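If your Spark version does not accept an interval-based RANGE frame, a minimal sketch of the same idea is to roll the data up to one row per month first and then use a ROWS frame (valid only when no month is missing). The table name t, the columns date and cnt, and the input format 'd-MMM-yy' are assumptions taken from the sample data:

-- Sketch: parse the string dates, aggregate to one row per month,
-- then sum the current month plus the 11 preceding month rows.
SELECT mon,
       SUM(month_cnt) OVER (ORDER BY mon
                            ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS running_12_months
FROM (
    SELECT trunc(to_date(date, 'd-MMM-yy'), 'MM') AS mon,
           SUM(cnt) AS month_cnt
    FROM t
    GROUP BY trunc(to_date(date, 'd-MMM-yy'), 'MM')
) monthly;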

Upvotes: 1

Sari Rahal

Reputation: 1955

I think what you are looking for is a GROUP BY clause with an aggregate (SUM) in the SELECT. I can't give you the perfect answer because I need more information about your schema: the database name, table name, and column names.

SELECT `dateField`,SUM(`countField`)
    FROM `TableName`
    GROUP BY `dateField`;

You can also group by month like this:

GROUP BY YEAR(`dateField`), MONTH(`dateField`)
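Since the question is about Spark SQL (which uses year()/month() on a parsed date rather than backtick-quoted MySQL identifiers), here is a minimal sketch of the same per-month grouping, assuming the string format 'd-MMM-yy' and the column names date and cnt from the sample data:

-- Per-month totals in Spark SQL; table and column names are assumptions from the sample data.
SELECT year(to_date(date, 'd-MMM-yy'))  AS yr,
       month(to_date(date, 'd-MMM-yy')) AS mon,
       SUM(cnt) AS month_cnt
FROM t
GROUP BY year(to_date(date, 'd-MMM-yy')), month(to_date(date, 'd-MMM-yy'));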

Upvotes: 1
