Dmitry Volkov

Reputation: 1337

Detecting a change in a certain column of a history table

Consider the following data:

history.data
=======================================
id |data_id| col1  | col2  | date
---+-------+-------+-------+-----------
1  |1      | 123   | 321   | 2017-08-01
2  |1      | 124   | 321   | 2017-08-03
3  |2      | 222   | 555   | 2017-08-05
4  |2      | 242   | 555   | 2017-08-07
5  |2      | 242   | 333   | 2017-08-11

This is the history_data table, where I keep all the changes made to a certain table. I need to get the date of the last change in the col1 column for every entry currently present in data. In that case, the desired output should be:

data_id | date
--------+-----------
1       | 2017-08-03
2       | 2017-08-07

I need to do this in the following context:

with cte1 as (
    select distinct on(data_id)
    data_id,
    date::date

    from data d
    join history.data hd on hd.data_id = d.id
    order by d.id, hd.date desc
)

As you can see, for now I'm just getting the date of the last record change, regardless of which column the change occurred in.

Could anyone please help me with that?

Upvotes: 2

Views: 293

Answers (3)

FuzzyTree

Reputation: 32392

You can use lag() to get the previous col1 value (aliased as prev_col1) and the condition prev_col1 <> col1 to identify all rows where a change occurred:

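select distinct on (data_id) * from (
    -- order by hd.id so lag() walks each data_id's history rows in insertion order
    select lag(hd.col1) over (partition by hd.data_id order by hd.id) prev_col1,
    hd.id,
    hd.col1,
    hd.data_id,
    hd.date::date
    from data d
    join history.data hd on hd.data_id = d.id
) t where prev_col1 <> col1 or prev_col1 is null
-- distinct on requires data_id to lead the order by; id desc then keeps the latest change
order by data_id, id desc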

The prev_col1 is null condition is needed for groups with only 1 member per data_id and assumes that the first member qualifies as a change.
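To see the lag() idea work on the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. Several things here are assumptions for illustration: the single history_data table (the join with data is skipped), the function name last_col1_change, and the use of max(date) with group by in place of distinct on, which SQLite does not support. It also needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Sample rows from the question: (id, data_id, col1, col2, date).
ROWS = [
    (1, 1, 123, 321, "2017-08-01"),
    (2, 1, 124, 321, "2017-08-03"),
    (3, 2, 222, 555, "2017-08-05"),
    (4, 2, 242, 555, "2017-08-07"),
    (5, 2, 242, 333, "2017-08-11"),
]

# Keep rows whose col1 differs from the previous row of the same data_id
# (or that have no previous row), then take the latest such date per data_id.
QUERY = """
select data_id, max(date) as last_change
from (
    select data_id, col1, date,
           lag(col1) over (partition by data_id order by id) as prev_col1
    from history_data
) t
where prev_col1 is null or prev_col1 <> col1
group by data_id
order by data_id
"""


def last_col1_change(rows):
    """Load rows into an in-memory table and return (data_id, date) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table history_data ("
        "id integer primary key, data_id integer, "
        "col1 integer, col2 integer, date text)"
    )
    conn.executemany("insert into history_data values (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for data_id, last_change in last_col1_change(ROWS):
        print(data_id, last_change)
```

Row 5 is skipped because only col2 changed there, so data_id 2 resolves to 2017-08-07 rather than 2017-08-11.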

Upvotes: 2

Giorgos Betsos

Reputation: 72165

You can use the following query:

select distinct on(data_id)
       data_id,
       col1
from data d
join history_data hd on d.id = hd.data_id
order by data_id, date desc;

to get the last col1 value per data_id:

data_id | col1
--------+------
1       | 124
2       | 242

Using the above query as a derived table you can join back to the original table to get the earliest date for each group:

select t1.data_id, t1.col1, min(date::date)
from history_data t1
join (
   select distinct on(data_id)
          data_id,
          col1
   from data d
   join history_data hd on d.id = hd.data_id
   order by data_id, date desc
) t2 on t1.data_id = t2.data_id and t1.col1 = t2.col1
group by t1.data_id, t1.col1;

Output:

data_id | col1 | min
--------+------+-----------
1       | 124  | 03.08.2017
2       | 242  | 07.08.2017

Note: The query will also return data_id groups related to just one col1 value. You need to change the query slightly to filter these rows out, in case you don't need them.
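The note above does not say how to filter. One possible way, which is an assumption and not necessarily what was intended, is to keep only data_id groups that have more than one distinct col1 value. The sketch below checks that condition with Python's built-in sqlite3 module standing in for PostgreSQL; the extra row for data_id 3 and the function name ids_with_col1_change are hypothetical.

```python
import sqlite3

# Sample rows from the question plus one hypothetical data_id (3)
# that has a single history row and therefore no col1 change.
ROWS = [
    (1, 1, 123, 321, "2017-08-01"),
    (2, 1, 124, 321, "2017-08-03"),
    (3, 2, 222, 555, "2017-08-05"),
    (4, 2, 242, 555, "2017-08-07"),
    (5, 2, 242, 333, "2017-08-11"),
    (6, 3, 777, 888, "2017-08-12"),
]

# data_id groups in which col1 took at least two different values.
QUERY = """
select data_id
from history_data
group by data_id
having count(distinct col1) > 1
order by data_id
"""


def ids_with_col1_change(rows):
    """Return the data_id values whose history holds more than one col1 value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table history_data ("
        "id integer primary key, data_id integer, "
        "col1 integer, col2 integer, date text)"
    )
    conn.executemany("insert into history_data values (?, ?, ?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    print(ids_with_col1_change(ROWS))
```

In the query above, the same subquery could be used as an extra condition such as t1.data_id in (...), so that single-value groups drop out of the result.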


Upvotes: 1

Oto Shavadze

Reputation: 42753

select data_id, max(mindt) from (
    select data_id, col1, min(date) as mindt
    from history_data
    group by data_id, col1
) t
group by data_id
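As a quick check of this query against the sample rows from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module, used as a stand-in for PostgreSQL. The table definition, the function name last_change_by_grouping, and the added order by are assumptions for illustration.

```python
import sqlite3

# Sample rows from the question: (id, data_id, col1, col2, date).
ROWS = [
    (1, 1, 123, 321, "2017-08-01"),
    (2, 1, 124, 321, "2017-08-03"),
    (3, 2, 222, 555, "2017-08-05"),
    (4, 2, 242, 555, "2017-08-07"),
    (5, 2, 242, 333, "2017-08-11"),
]

# Inner query: first date on which each (data_id, col1) pair appeared.
# Outer query: the latest of those first appearances per data_id.
QUERY = """
select data_id, max(mindt) from (
    select data_id, col1, min(date) as mindt
    from history_data
    group by data_id, col1
) t
group by data_id
order by data_id
"""


def last_change_by_grouping(rows):
    """Load rows into an in-memory table and return (data_id, date) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table history_data ("
        "id integer primary key, data_id integer, "
        "col1 integer, col2 integer, date text)"
    )
    conn.executemany("insert into history_data values (?, ?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for data_id, last_change in last_change_by_grouping(ROWS):
        print(data_id, last_change)
```

This matches the desired output for the sample data. Note that the approach relies on col1 never returning to an earlier value: if it went 123, 124, 123, the second appearance of 123 would be folded into the first by min(date), and the reported date would be that of 124.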

Upvotes: 1
