rshar
rshar

Reputation: 1477

Selectively filter rows where column value is not null PostgreSQL

I have following table in Postgres 11.0.

col1        col2        col3                                                       col4             col5
1           NCT02582996 acetaminophen+caffeine+dihydroergotamine+metoclopramide.    metoclopramide. 204756
1           NCT02582996 acetaminophen+caffeine+dihydroergotamine+metoclopramide.    metoclopramide. null
3           NCT02582997 acetaminophen                                               metoclopramide. 204758
4           NCT02582998 ibuprufin                                                   ibuprufin       null

For same col1 value, I would like to keep only those rows where col5 is not null. and discard the other row.

Desired output is:

col1        col2        col3                                                       col4             col5
1           NCT02582996 acetaminophen+caffeine+dihydroergotamine+metoclopramide.    metoclopramide. 204756
3           NCT02582997 acetaminophen                                               metoclopramide. 204758
4           NCT02582998 ibuprufin                                                   ibuprufin       null
select * from tbl where col5 is not null  --excluding row 4

I am not sure how can I achieve this. Any help is highly appreciated.

Upvotes: 0

Views: 745

Answers (2)

wildplasser
wildplasser

Reputation: 44230

CREATE TABLE tbl
        ( col1 integer
        , col2 text
        , col3 text
        , col4 text
        , col5 integer
        );
INSERT INTO tbl
VALUES
(1, 'NCT02582996', 'acetaminophen+caffeine+dihydroergotamine+metoclopramide.', 'metoclopramide.', 204756)
,(1, 'NCT02582996','acetaminophen+caffeine+dihydroergotamine+metoclopramide.', 'metoclopramide.',NULL)
,(3, 'NCT02582997','acetaminophen', 'metoclopramide.', 204758)
,(4, 'NCT02582998','ibuprufin',NULL,NULL)
        ;

select * from tbl t1;

select * from tbl t1
where col5 is not null
OR NOT EXISTS (
        select * from tbl x
        WHERE x.col1 = t1.col1
        and x.col5 is not null
        );

Results:


CREATE TABLE
INSERT 0 4
 col1 |    col2     |                           col3                           |      col4       |  col5  
------+-------------+----------------------------------------------------------+-----------------+--------
    1 | NCT02582996 | acetaminophen+caffeine+dihydroergotamine+metoclopramide. | metoclopramide. | 204756
    1 | NCT02582996 | acetaminophen+caffeine+dihydroergotamine+metoclopramide. | metoclopramide. |       
    3 | NCT02582997 | acetaminophen                                            | metoclopramide. | 204758
    4 | NCT02582998 | ibuprufin                                                |                 |       
(4 rows)

 col1 |    col2     |                           col3                           |      col4       |  col5  
------+-------------+----------------------------------------------------------+-----------------+--------
    1 | NCT02582996 | acetaminophen+caffeine+dihydroergotamine+metoclopramide. | metoclopramide. | 204756
    3 | NCT02582997 | acetaminophen                                            | metoclopramide. | 204758
    4 | NCT02582998 | ibuprufin                                                |                 |       
(3 rows)

Upvotes: 1

GMB
GMB

Reputation: 222402

I think you can use use distinct on:

select distinct on (col1) t.*
from tbl t
order by col1, col5

This generates one row per distinct value in col1 only. If col1 has both a non-null and null value in col5, then the non-null value is preserved. If there is only one row available, with a null value, then it is selected.

Upvotes: 2

Related Questions