tableadventure
tableadventure

Reputation: 23

Why do these two SELECT COUNT(*) with negated WHERE clauses not sum up correctly?

I'm selecting rows from a table with a where clause but negating the where clause returns more rows than it should.

Given a table with 1000 rows a select where returns 600 rows, shouldn't the negated version of that select where return 400 rows?

select count(*) from trips

The trips table has 420444 rows.

I'm selecting all trips that start in the weekend

select count(*) from trips
where service_id in
(select service_id from calendar_dates
where date_part('dow', date) in (5, 6))

returns 363272

Running the same query but for trips not started on the weekend

select count(*) from trips
where service_id in
(select service_id from calendar_dates
where date_part('dow', date) not in (5, 6))

returns 377326

363272 + 377326 = 740598 which is way more than 420444

Does the count function act differently when there is a subquery in a where clause involved?

This is done on a database with GTFS data https://developers.google.com/transit/gtfs/. I can't figure out what I've missed.

Upvotes: 0

Views: 82

Answers (1)

FXD
FXD

Reputation: 2060

It is not negated.

If you have the same service_id with dow = 3, 4, 5, 6, then it will appear in the 2 counts.

The correct negation is

select count(*) from trips
where service_id NOT in
(select service_id from calendar_dates
where date_part('dow', date) in (5, 6))

Or equivalent with not exists

Upvotes: 4

Related Questions