Reputation: 521
I have two tables. One stores "locations":
TABLE location (
ID serial PRIMARY KEY,
name text NOT NULL,
description text NOT NULL
);
Each location has many rows of "data":
TABLE data(
ID smallint REFERENCES location(ID),
date date,
rainfall int
);
I would like to find all locations that have "data" spanning a given period, and which have at least "min" number of values within that period. I have tried this:
SELECT location.ID, location.name
FROM location
JOIN data
ON data.id = location.id
GROUP BY location.id
HAVING MIN(data.date) <= '$start_date'
AND
MAX(data.date) >= '$end_date'
AND
(SELECT COUNT(*) FROM data WHERE data.date >= '$start_date' AND data.date <= '$end_date') >= '$min'
ORDER BY location.ID
If I take out the second last line (the SELECT COUNT line) it correctly returns locations that have data spanning the desired period (but doesn't have the constraint that the number of values within the desired period is greater than or equal to "min").
Can anyone tell me how to impose the constraint? ie. what is wrong with my "SELECT COUNT line".
The following example data may help clarify my problem:
Example data:
location:
ID = 1, name = "London", description = "test location 1"
ID = 2, name = "New York", description = "test location 2"
data:
ID = 1, date = 2001-01-01, rainfall = 0.0
ID = 1, date = 2001-01-02, rainfall = 0.0
ID = 1, date = 2001-01-03, rainfall = 0.0
ID = 1, date = 2001-01-04, rainfall = 0.0
ID = 1, date = 2001-01-05, rainfall = 0.0
ID = 1, date = 2001-01-06, rainfall = 0.0
ID = 1, date = 2001-01-07, rainfall = 0.0
ID = 2, date = 2001-01-01, rainfall = 0.0
ID = 2, date = 2001-01-04, rainfall = 0.0
ID = 2, date = 2001-01-05, rainfall = 0.0
ID = 2, date = 2017-01-01, rainfall = 0.0 # Not within the desired period, so is excluded
ID = 2, date = 2017-01-02, rainfall = 0.0 # Not within the desired period, so is excluded
ID = 2, date = 2017-01-03, rainfall = 0.0 # Not within the desired period, so is excluded
ID = 2, date = 2017-01-04, rainfall = 0.0 # Not within the desired period, so is excluded
If I search for all locations with data between 2001-01-01 and 2001-01-07, and which have at least 6 data values, it should only return location 1 (ID=1). The second location (ID=2) should not be returned as it does not have the required number of values within the desired period.
Upvotes: 1
Views: 874
Reputation: 51971
This is a new version of my answer since I had misunderstood 'spanning'. My interpretation of the question is still though that when counting values we should only do it within the given period
SELECT l.id, l.name
FROM location l
JOIN location_data d
ON l.id = d.id
GROUP BY l.id, l.name
HAVING MIN(d.date) <= '2017-01-01'
AND MAX(d.date) >= '2017-12-31'
AND (SELECT COUNT(b.id)
FROM location_data b
WHERE b.date BETWEEN '2017-01-01' AND '2017-12-31'
AND b.id = l.id) >= 2
I renamed the table data to location_data in my test db but that might be obvious :)
Upvotes: 2
Reputation: 1269933
First, you should be passing in values such as dates as parameters, not as strings. Second, COUNT()
returns a number, so the comparison should be to a number, not a string.
You should be able to do what you want with group by
and having
. One method is:
SELECT l.ID, l.name
FROM location l JOIN
data d
ON d.id = l.id
GROUP BY l.id, l.name
HAVING MIN(d.date) <= '$start_date' AND
MAX(d.date) >= '$end_date' AND
COUNT(*) >= $min
ORDER BY l.ID;
Upvotes: 1