bigleftie
bigleftie

Reputation: 453

SQL to find number of NULL and non-NULL entries for a column

For each POSTAL_CODE, I want to know how many NULL TIME_VISITEDs there are and how many NOT NULL TIME_VISITEDs

CREATE TABLE VISITS
(
 ID INTEGER NOT NULL,
 POSTAL_CODE VARCHAR(5) NOT NULL,
 TIME_VISITED TIMESTAMP,
 CONSTRAINT PK_VISITS PRIMARY KEY (ID)
);

Sample data:

INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('234', '01910', '21.04.2014, 10:13:33.000');
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('334', '01910', '28.04.2014, 13:13:33.000');
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('433', '01910', '29.04.2014, 13:03:19.000');
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('533', '01910', NULL);
INSERT INTO VISITS (ID, POSTAL_CODE, TIME_VISITED) VALUES ('833', '01910', NULL);

This is the output I want for the data above:

POSTAL_CODE=01910, NUM_TIME_VISITED_NULL=2, NUM_TIME_VISITED_NOT_NULL=3

I am using the following SQL

SELECT distinct r.POSTAL_CODE, 
   (select count(*) from VISITS p where p.POSTAL_CODE=r.POSTAL_CODE and p.TIME_VISITED is null) as NUM_TIME_VISITED_NULL,
   (select count(*) from VISITS p where p.POSTAL_CODE=r.POSTAL_CODE and p.TIME_VISITED is not null) as NUM_TIME_VISITED_NOT_NULL
FROM VISITS r
ORDER BY r.POSTAL_CODE

The query takes a very long time if there are lots of rows in the table

What changes do I need to make to be able to get this information more quickly?

Upvotes: 1

Views: 157

Answers (2)

woot
woot

Reputation: 7606

You can do this all in one pass. COUNT counts how many non-NULLs there are. Then use SUM of a CASE statement to count up all the NULLs.

SELECT POSTAL_CODE
      ,COUNT(TIME_VISITED) AS NUM_TIME_VISITED_NOT_NULL
      ,SUM(CASE WHEN TIME_VISITED IS NULL THEN 1 ELSE 0 END)) AS NUM_TIME_VISITED_NULL
FROM VISITS
GROUP BY POSTAL_CODE

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269563

Use conditional aggregation instead:

select v.postal_code,
       sum(case when v.time_visited is null then 1 else 0
        end) as NumTimeVisitedNull,
       count(v.time_visited) as NumTimeVisitedNotNull
from visits v
group by v.postal_code;

Note: you can also write this as:

select v.postal_code,
       (count(*) - count(v.time_visited) ) as NumTimeVisitedNull,
       count(v.time_visited) as NumTimeVisitedNotNull
from visits v
group by v.postal_code;

The count() function specifically counts the number of non-NULL values.

Upvotes: 3

Related Questions