pafetos110
pafetos110

Reputation: 1

SQL- selecting rows where one value is greater than the first value date

I have the following table in SQL-

user_id    purchase_date      revenues  
-------------------------------------
A           2018-01-01          30
A           2018-01-02          28
A           2018-01-04          45
A           2018-01-06          55
B           2018-01-03          20
B           2018-01-05          21   

And I have to select only the rows in which the revenue from a user was greater than the revenue on their first day.

Meaning the expected result should be-

user_id    purchase_date   revenues 
-------------------------------------
A           2018-01-04          45
A           2018-01-06          55
B           2018-01-05          21 

I'm just starting to learn SQL, and I can do each query (by date or revenue per user) on its own, but I can't seem to get the hang of it with this complexity.

Thanks a lot!

Upvotes: 0

Views: 1858

Answers (3)

allmhuran
allmhuran

Reputation: 4474

A cross apply will make your life a lot easier here. This is probably the "easiest to understand" (but not the most efficient) way of writing the query, because it aligns very naturally with the "in plain english" way of writing the question.

The cross apply operator lets you write a select statement, and then "apply" that select statement to each row on the left hand side.

In "plain English" (but written using more SQL-ish terms), you want

the rows where the revenues for a user_id is greater than the revenues for that user_id on their top 1 purchase_date when ordered by date ascending.

Let's tackle the second part of that. For a user_id, get the top 1 revenues ordered by purchase_date ascending:

select    top 1 revenues 
from      MyTable 
where     user_id = -- [just one particular user id]
order by  purchase_date asc

OK, but we need that for every user. So we use cross apply:

select       t.user_id,
             t.purchase_date,
             t.revenues,
             first_day_revenues_by_user.revenues
from         MyTable    t
cross apply  (
                select    top 1 revenues
                from      MyTable 
                where     user_id = t.user_id -- use the user_id from the outer select
                order by  purchase_date asc
             ) 
             as first_day_revenues_by_user -- give the subquery an alias we can use

Now we have a select statement which has all the columns of the original table, plus an additional column containing the revenues value for their first day. So now we can add a simple where clause:

select       t.user_id,
             t.purchase_date,
             t.revenues
from         MyTable    t
cross apply  (
                select    top 1 revenues
                from      MyTable
                where     user_id = t.user_id
                order by  purchase_date asc
             ) 
             as first_day_revenues_by_user 
where        t.revenues > first_day_revenues_by_user.revenues

Upvotes: 0

Dmitry Kolchev
Dmitry Kolchev

Reputation: 2216

You can use windowed function ROW_NUMBER()

with A as
(
    select user_id, purchase_date, revenues
    from 
        (values
        ('A',           '2018-01-01',          30),
        ('A',           '2018-01-02',          28),
        ('A',           '2018-01-04',          45),
        ('A',           '2018-01-06',          55),
        ('B',           '2018-01-03',          20),
        ('B',           '2018-01-05',          21)) as T(user_id, purchase_date, revenues)   
),
B as (
    select 
        ROW_NUMBER() over (partition by user_id order by purchase_date) as n, 
        user_id, purchase_date, revenues from A
),
C as (
    select * from B where n = 1
)
select
    A.user_id, A.purchase_date, A.revenues
from
    A inner join C on (A.user_id = C.user_id and A.revenues > C.revenues)

also you can use FIRST_VALUE function

with A as
(
    select user_id, purchase_date, revenues
    from 
        (values
        ('A',           '2018-01-01',          30),
        ('A',           '2018-01-02',          28),
        ('A',           '2018-01-04',          45),
        ('A',           '2018-01-06',          55),
        ('B',           '2018-01-03',          20),
        ('B',           '2018-01-05',          21)) as T(user_id, purchase_date, revenues)   
),
B as (
    select 
        user_id, purchase_date, revenues, 
        FIRST_VALUE(revenues) over (partition by user_id order by purchase_date) as n
    from A
    )
select user_id, purchase_date, revenues from B where revenues > n;

Upvotes: 2

Fahmi
Fahmi

Reputation: 37493

You can try below using self join

DEMO

select distinct t1.* from t t1 
inner join t t2 
on t1.user_id=t2.user_id and t1.purchase_date>t2.purchase_date 
and t1.revenues>t2.revenues

OUTPUT:

user_id purchase_date   revenues
  A     2018-01-04       45
  A     2018-01-06       55
  B     2018-01-05       21

Upvotes: 0

Related Questions