SG Tech Edge
SG Tech Edge

Reputation: 507

Two dimensional comparison in sql

DB schema

CREATE TABLE newsletter_status
(
    cryptid varchar(255) NOT NULL,
    status varchar(25),
    regDat timestamp,
    confirmDat timestamp,
    updateDat timestamp,
    deleteDat timestamp
);

There are rows with the same cryptid, I need to squash them to one row. So the cryptid becomes effectively unique. The complexity comes from the fact that I need to compare dates by rows as well as by columns. How to implement that?

The rule I need to use is:

Example:

002bc5 | new         | 2010.01.15 | 2001.01.15 | NULL       | 2020.01.10
002bc5 | confirmed   | NULL       | 2020.01.30 | 2020.01.15 | 2020.01.15
002bc5 | deactivated | NULL       | NULL       | NULL       | 2020.12.03

needs to be squashed into:

002bc5 | deactivated | 2010.01.15 | 2020.01.30 | 2020.01.15 | 2020.12.03

The status deactivated is taken because the timestamp 2020.12.03 is the latest

Upvotes: 0

Views: 75

Answers (3)

astentx
astentx

Reputation: 6751

What you need to get the status is to sort rowset by dates in descending order. In Oracle there is agg_func(<arg>) keep (dense_rank first ...), in other databases it can be replaced with row_number() and filter. Because analytic functions in HANA works not so good sometimes, I suggest to use the only one aggregate function I know in HANA that supports ordering inside - STRING_AGG - with little trick. If you have not a thousands of rows with statuses (i.e. concatenated status will not be greater 4000 for varchar), it will work. This is the query:

select
  cryptid,
  max(regDat) as regDat,
  max(confirmDat) as confirmDat,
  max(updateDat) as updateDat,
  max(deleteDat) as deleteDat,
  substr_before(
    string_agg(status, '|'
      order by greatest(
        ifnull(regDat, date '1000-01-01'),
        ifnull(confirmDat, date '1000-01-01'),
        ifnull(updateDat, date '1000-01-01'),
        ifnull(deleteDat, date '1000-01-01')
      ) desc),
    '|'
  ) as status
from newsletter_status
group by cryptid

Upvotes: 1

GMB
GMB

Reputation: 222482

I would start by ranking the rows of each cryptid by the greatest value of the date column. Then we can use that information to identify the latest status per cryptid, and aggregate :

select cryptid, 
    max(case when rn = 1 then status end) as status,
    max(regDate) as regDat,
    max(confirmDat) as confirmDat,
    max(updatedDat) as updatedDat,
    max(deleteDat) as deleteDat
from (
    select ns.*, 
        row_number() over(
            partition by cryptid 
            order by greatest(
                coalesce(regDate,    '0001-01-01'), 
                coalesce(confirmDat, '0001-01-01'),
                coalesce(updatedDat, '0001-01-01'), 
                coalesce(deleteDat,  '0001-01-01')
            )
        ) rn 
    from newsletter_status ns
) ns
group by cryptid

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269843

You can use aggregation:

select cryptid,
       coalesce(max(case when status = 'deactivated' then status end)
                max(case when status = 'confirmed' then status end),
                max(case when status = 'new' then status end),
               ) as status,
       max(regDat),
       max(confirmDat),
       max(updateDat),
       max(deleteDat)
from newsletter_status
group by cryptid;

The coalesce()s are a trick to get the statuses in priority order.

EDIT:

If you just want the row with the latest timestamp:

select cryptid,
       max(case when seqnum = 1 then status end) as status_on_max_date,
       max(regDat),
       max(confirmDat),
       max(updateDat),
       max(deleteDat)
from (select ns.*,
             row_number() over (partition by cryptid
                                order by greatest(coalesce(regDat, '2000-01-01'),
                                              coalesce(confirmDat, '2000-01-01'),
                                               coalesce(updateDat, '2000-01-01'),
                                               coalesce(deleteDat, '2000-01-01')
                                         )
                               ) as seqnum
                      
      from newsletter_status ns
     ) ns
group by cryptid;

Upvotes: 1

Related Questions