RustyShackleford
RustyShackleford

Reputation: 3667

How to "Pick" row based on comparison of value in columns for every ID that is duplicated?

I have a table that looks like this (many more columns):

cid        pid      type
C12         1         E
C34         1         F
C01         1         G
            2         F
            2         G

I want to 'pick' the max 'cid' and hide the row with the lesser value in the table when:

 1. 'cid' is present
 2. 'pid' is the same

If no 'cid' then I dont have anything to 'pick' the max row. 'cid' always begins with 'c', not sure if this causing me to fail since I am trying to pick the max of alphanumeric.

The new table should look like this:

cid        pid      type
C34         1         F
            2         F
            2         G

I have tried doing:

select max(cid) as cid from table. 

but it's not working.

Also tried group by, but I have 50 columns and it's still not working:

select max(cid) as cid, pid,type
from table
group by cid, pid, type

When I try to group by just 'pid', I get an error, that I need to eventually group all columns.

---edit----

I am open to removing the join in the table that is creating the 'cid' in the final table out and doing a separate join if that helps. Ideally I would like to be able to 'pick' the row I want from the existing table.

Upvotes: 0

Views: 61

Answers (3)

Svetlin Zarev
Svetlin Zarev

Reputation: 15673

With window functions:

select cid, pid, kind
from (
         select cid, pid, kind, row_number() over (partition by pid order by cid desc nulls last ) as rn
         from demo
     ) as T
where cid is null or rn = 1;

DB Fiddle: https://www.db-fiddle.com/f/kPjgji1w2hm51JaBFGQzCL/0

Upvotes: 0

forpas
forpas

Reputation: 164069

With NOT EXISTS:

select t.* from tablename t
where t.cid is null
or not exists (
  select 1 from tablename
  where pid = t.pid and cid > t.cid
) 

See the demo.
Results:

| cid | pid | type |
| --- | --- | ---- |
| C34 | 1   | F    |
|     | 2   | F    |
|     | 2   | G    |

Upvotes: 1

S-Man
S-Man

Reputation: 23676

demo:db<>fiddle

SELECT 
    *
FROM (
    SELECT DISTINCT ON (pid)
        *
    FROM
        mytable
    WHERE cid IS NOT NULL
    ORDER BY pid, cid DESC
) s

UNION ALL

SELECT 
    *
FROM
    mytable
WHERE cid IS NULL

I split up the query into: cid exists and cid does not exist:

  1. Do get the max of a certain group you can use the DISTINCT ON clause, which always gives the first record of an ordered group. In your case the group is pid and this is ordered by cid in DESCending order to get the highest to the top. This record is the result.
  2. After calculating these records simply UNION the records without any cid value.

Upvotes: 0

Related Questions