Reputation: 1702
My table scheme looks like this
id | user_id | price | date
1235085 | 429009 | 1301.3 | 2016-01-01
1235016 | 1106100 | 2343.6 | 2016-01-01
1235007 | 707164 | 980.7 | 2016-01-01
there are 20 million records.
I have to find users which are made some orders in any year, but didn't the following year.
I tried use this query
select user_id
from orders o1
where not exists (select user_id from orders o2
where extract(year from o2.date) + 1 > extract(year from o1.date))
but it doesn't work
Upvotes: 0
Views: 169
Reputation: 1269623
Here is one method:
select user_id, yyyy
from (select user_id, date_trunc('year', date) as yyyy,
lead(date_trunc('year', date)) over (partition by user_id order by date_trunc('year', date)) as next_year
from t
group by user_id, yyyy
) u
where next_year <> yyyy + interval '1 year' or next_year is null;
This assumes that you actually want the year as well. If not, use select distinct user_id
.
You might also want to add the condition yyyy <> date_trunc(now())
so you don't get users who made their first purchase this year. Without this condition, I think you will return all users, because every user has a "last purchase" with no purchases the following year.
EDIT:
Interestingly, you can do this with lead()
as well:
select user_id, date
from (select t.*, lead(date) over (partition by user_id order by date) as next_date
from t
) t
where (next_date is null or
extract(year from next_date) <> extract(year from date) + 1
) and
date < date_trunc('year', now());
Because lead()
orders the values, this should return at most one value for a given year, even when there are multiple orders in a year.
Upvotes: 0
Reputation: 164089
Use EXCEPT:
select distinct user_id from orders
except
select distinct user_id
from orders o1
where exists(
select 1
from orders o2
where o2.user_id = o1.user_id
and extract(year from o2.date) + 1 = extract(year from o1.date)
)
Upvotes: 1