Reputation: 457
I have a requirement that is throwing me for a loop.I have to return location IDs that have exactly 3 consecutive rows of monthly invoice amounts >$2,000. In other words, I do not want to return IDs for mature locations (that might have hundreds of monthly invoices rows).
Rextester example data: http://rextester.com/CNJC15871
Info:
Desired Output: In the invoice table below,
For subsequent report months, 2223, 3344, and 6678 should NOT be returned because they have >3 consecutive months >$2,000.
| loc_id | invoice_date | invoice_amt | Notes |
|--------|--------------|-------------|---------------------------|
| 1234 | 5/15/2002 | 7000 | |
| 1234 | 6/15/2002 | 8000 | |
| .. | … | … | |
| 1234 | 11/15/2017 | 58000 | |
| | | | |
| 9987 | 11/15/2006 | 7500 | |
| 9987 | 12/15/2006 | 8500 | |
| … | … | | |
| 9987 | 11/15/2017 | 63000 | |
| | | | |
| 5544 | 3/15/2015 | 9200 | |
| 5544 | 4/15/2015 | 10000 | |
| … | … | | |
| 5544 | 11/15/2017 | 70000 | |
| | | | |
| 2223 | 5/15/2017 | 2500 | Count| >2000 |
| 2223 | 6/15/2017 | 1375 | Do not count| <2000 |
| 2223 | 7/15/2017 | 8000 | Restart count| >2000 (1) |
| 2223 | 8/15/2017 | 9000 | Continue count| >2000 (2) |
| 2223 | 9/15/2017 | 9800 | Continue count| >2000 (3) |
| 2223 | 10/15/2017 | 10500 | Stop count| >3 in a row |
| 2223 | 11/15/2017 | 11200 | Stop count| >3 in a row |
| | | | |
| 3344 | 7/15/2017 | 3500 | Count| >2000 (1) |
| 3344 | 8/15/2017 | 4500 | Continue count| >2000 (2) |
| 3344 | 9/15/2017 | 6000 | Continue count| >2000 (3) |
| 3344 | 10/15/2017 | 7000 | Stop count| >3 in a row |
| 3344 | 11/15/2017 | 8000 | Stop count| >3 in a row |
| | | | |
| 6678 | 8/15/2017 | 3000 | Count| >2000 (1) |
| 6678 | 9/15/2017 | 4000 | Continue count| >2000 (2) |
| 6678 | 10/15/2017 | 5000 | Continue count| >2000 (3) |
I also have a loc_id dimension that includes the location open date.
| loc_id | loc_open_dt |
|--------|-------------|
| 1234 | 2002-05-01 |
| 9987 | 2006-10-22 |
| 5544 | 2015-03-04 |
| 2223 | 2017-05-05 |
| 3344 | 2017-07-05 |
| 6678 | 2017-08-01 |
Upvotes: 0
Views: 136
Reputation: 351308
Here is query that checks that:
Query:
select distinct loc_id
from (
select loc_id,
first_value(invoice_amt) over win first_amt,
floor((list_date - first_value(invoice_date) over win)/30)+1 month_count,
list_date - last_value(invoice_date) over win < 30 has_last_month,
count(case when invoice_amt >= 2000 then 1 end) over win large_amt_count
from invoices,
(select date '2017-10-01' /* current_date */ list_date) ref
where invoice_date between (list_date - 120) and list_date
window win as (partition by loc_id order by invoice_date)
) base
where month_count = 3 + (first_amt < 2000)::int
and large_amt_count = 3
and has_last_month;
See it run on rextester
Change the literal date in the middle of the query to your actual reporting date (or current_date
).
Upvotes: 1
Reputation: 3391
In PostgreSQL you can use a window function. You need to build the windows by aggregating your data by loc_id
and then check that 3 conscutive rows have an invoice_amt
greater than the target value. The trick is done by using the lag()
function that, applied over a window, let you grab data from the previous rows. The code is much simpler than the explanation:
SELECT DISTINCT loc_id FROM (
SELECT *,
invoice_amt > 10000 AS a,
lag(invoice_amt, 1) OVER w > 10000 AS b,
lag(invoice_amt, 2) OVER w > 10000 AS c,
extract('month' from invoice_date::date) AS m1,
extract('month' from (lag(invoice_date, 1) OVER w)::date + '1 month'::interval) AS m2,
extract('month' from (lag(invoice_date, 2) OVER w)::date + '2 month'::interval) AS m3
FROM invoices
WINDOW w AS (PARTITION BY loc_id ORDER BY invoice_date)
) X
WHERE a AND b AND c AND m2 = m1 AND m3 = m1
Note also the check on consecutive months. We just add 1 or 2 months to the lag()
ged date and then check that the month is same for the three consecutive rows (as asked in the comments).
If you want to better understand how that works just run only the inner SELECT
and look at the results.
Upvotes: 1