Reputation: 17157
Recently I've been informed by a StackOverflow user that using BETWEEN
operator with values of data type timestamp without time zone
should not be used. Below is the quote.
Between means >= and <= and shall not be used with ranges that contain timestamps.
When asked for an explanation of this thesis or a link to Postgres documentation where it states that I've got an answer saying
Why would such a simple thing need a site with documentation. I am sure you can find many anyway if you google (at least my detailed posts on various forums demonstrating the case)
Well I googled. And found nothing that would advise against using this operator with timestamp values. In fact this answer on SO uses them and so does this mailing group post.
I was informed that all these years I was doing it wrong. Is it really the case?
As far as I know Postgres max precision for a timestamp is 1 microsecond
- correct me if I'm wrong. Thus aren't below statements equivalent ?
sample_date BETWEEN x AND y::timestamp - INTERVAL '1 microsecond'
and
sample_date >= x AND sample_date < y
Edit: The sample is just a consideration of the difference. I'm aware of the fact that developers can miss the time part, but assuming one knows how it behaves, why should it not be used? Generally speaking, this is merely a sample, but I'm wondering about the bigger scope. I've been investigating the planner and it seems to be parsing BETWEEN
to >= AND <=
.
Why does one preferably write >= AND <=
than BETWEEN
in the matter of results - not including the time to translate it?
Upvotes: 2
Views: 16364
Reputation: 421
sample_date BETWEEN x AND y::timestamp - INTERVAL '1 microsecond'
... might actually still be broken due to leap seconds. But lets ignore that...
IMO the biggest reason not to use BETWEEN is because what it tells the database to do doesn't actually match the way our brains interpret it. If I say "it happened sometime between Jan 3rd and Jan 6th", most (or at least many) people will translate that into the interval "[Jan 3rd, Jan 6th)", which is definitely NOT what you get with the SQL expression "BETWEEN 'Jan 3rd' AND 'Jan 6th'". So even though that SQL expression has a completely defined meaning, that meaning conflicts with how most/many people read it in English. But if you use >= AND <, there's absolutely no question about what the intention is.
Upvotes: 0
Reputation: 126
I assume this is a reference the SO user would have shared: https://wiki.postgresql.org/wiki/Don't_Do_This
I used BETWEEN with two timestamps in a recent query and it worked but took much longer than expected. So, I can't verify any specific issue except, perhaps, performance; although this could be due to my giant data set.
I retried the same query with the >= timestamp1 AND < timestamp2
pattern and performance was about the same.
I suppose a real concern here might be using timestamps without timezone and being sure you're getting the right records?
Upvotes: 0
Reputation: 246788
There is absolutely nothing wrong with using ts BETWEEN validfrom AND validto
instead of ts >= validform AND ts <= validto
. They are the same.
I can only guess, but I'd say that the warning targets something different, namely whether either of the (identical) clauses above are the right thing to use.
Now this of course depends on what you are trying to do, but very often clauses like this are used to identify the one valid row for a particular timestamp. In that case the clause as above is wrong, because for a value of ts
when the row was changed, you would get two results.
Consider this:
CREATE TABLE names (
id integer PRIMARY KEY,
val text NOT NULL,
validfrom timestamptz NOT NULL,
validto timestamptz NOT NULL
);
INSERT INTO names VALUES (1, 'Smith', '1985-05-02 00:00:00', '2009-01-30 00:00:00');
INSERT INTO names VALUES (2, 'Jones', '2009-01-30 00:00:00', 'infinity');
This is meant to be a historized table of names for a person.
If you use a WHERE
clause like above to query for the name valid at a certain time, it would work well for
SELECT val FROM names
WHERE current_timestamp BETWEEN validfrom AND validto;
But it would do the wrong thing for
SELECT val FROM names
WHERE '2009-01-30' BETWEEN validfrom AND validto;
That is because the end point of the interval of validity for a name is not part of the interval. For this case, it would be correct to write:
SELECT val FROM names
WHERE '2009-01-30' >= validfrom AND '2009-01-30' < validto;
Upvotes: 2