Reputation: 133
I'm trying to summary all the data changes from a client's history table (on the columns CIVILITY, LAST_NAME, FIRST_NAME, BIRTH_DATE), for each CLIENT_ID and ordered by date.
Here an example of data for one CLIENT_ID (I've highlighted in yellow all the data changes => and these are 6 lines I want to extract):
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','SYS ',to_timestamp('19/05/20 13:10:46,508000000','DD/MM/RR HH24:MI:SSXFF'),'1','SPRINTQUINZE','Julien',to_date('24/03/1979','DD/MM/RR'));
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','SYS ',to_timestamp('19/05/20 13:10:42,801000000','DD/MM/RR HH24:MI:SSXFF'),'1','SPRINTQUINZE','Julien',to_date('24/03/1979','DD/MM/RR'));
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','ROD ',to_timestamp('19/05/20 13:10:40,901200000','DD/MM/RR HH24:MI:SSXFF'),'1','SPRINTQUINZE','Juliette',to_date('24/03/1979','DD/MM/RR'));
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','ROD ',to_timestamp('19/05/20 13:10:39,703000000','DD/MM/RR HH24:MI:SSXFF'),'1','SPRINTQUINZE','Julien',to_date('24/03/1979','DD/MM/RR'));
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','SYS ',to_timestamp('07/05/20 13:32:57,643000000','DD/MM/RR HH24:MI:SSXFF'),'1','SPRINTQUINZE','Julien',to_date('24/03/1979','DD/MM/RR'));
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','SYS ',to_timestamp('07/05/20 00:17:25,352000000','DD/MM/RR HH24:MI:SSXFF'),'1','SPRINTQUINZE','Julia',to_date('24/03/1979','DD/MM/RR'));
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','ECA ',to_timestamp('04/05/20 15:15:48,167000000','DD/MM/RR HH24:MI:SSXFF'),'1','SPRINTQUINZE','Julien',to_date('24/03/1979','DD/MM/RR'));
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','ECA ',to_timestamp('04/05/20 15:15:46,853000000','DD/MM/RR HH24:MI:SSXFF'),'2','SPRINTQUINZE','Julien',to_date('24/03/1979','DD/MM/RR'));
Insert into client_history (client_id,source_name,hist_create_date,civility,last_name,first_name,birth_date) values ('1000002 ','INT ',to_timestamp('02/06/19 11:15:32,931000000','DD/MM/RR HH24:MI:SSXFF'),'2','SPRINTQUINZE','Julien',to_date('24/03/1979','DD/MM/RR'));
I've tried to use analytic functions to do so, but I can't find a way to get the right result. For example with this query:
--select * from (
select
client_id,
source_name,
hist_create_date,
civility,
last_name,
first_name,
birth_date,
row_number() over (partition by client_id, civility, last_name, first_name, birth_date order by hist_create_date asc) rn
from client_history
order by hist_create_date desc
--) where rn=1;
Which returns the lines highlighted in orange (but I want to get the lines which have columns highlighted in yellow):
I understand why this query result fails, but I don't find the right way to reach my goal.
Upvotes: 0
Views: 231
Reputation: 553
If I correctly understood you want to get all rows in which at least one of the four column values differs from the value in previous row when rows are ordered by hist date. You can concatenate values and use lag to compare current row data with the previous one like
select id, source_name, hist_create_date, civility, last_name, first_name, birth_date
from (
select t.*,
civility||'|'||last_name||'|'||first_name||'|'||birth_date cur_val,
lag(civility||'|'||last_name||'|'||first_name||'|'||birth_date, 1, null) over(partition by id order by hist_create_date) prev_val
from test_tab t
) where prev_val is null or prev_val <> cur_val
Upvotes: 1