Reputation: 547
I want to express this:
SELECT
a.*
,b.timestamp_col
FROM weird_data_source a
LEFT JOIN weird_data_source b
ON a.id = b.id
AND b.timestamp_col = (
SELECT
MAX(sub.timestamp_col)
FROM weird_data_source sub
WHERE sub.id = a.id
AND sub.date_col <= a.date_col
AND sub.timestamp_col < a.timestamp_col
)
A couple notes here about the data:
date_col
and timestamp_col
aren't representing the same thing.But the subquery is invalid. Netezza cannot handle the <
operator in the correlated subquery. For the life of me I cannot figure out an alternative. How could I get around this?
My gut is telling me this could potentially be done with a join, but I haven't been able to be successful at this yet.
There's a dozen or so similar questions, but none of them seem to get at handling this type of inequality.
Upvotes: 1
Views: 490
Reputation: 47382
This should get you pretty close. You will get duplicate rows if there are two rows with the exact same timestamp_col
that otherwise meet the criteria, but otherwise you should be good:
SELECT
a.id,
a.some_other_columns, -- Because we NEVER use SELECT *
b.timestamp_col
FROM
weird_data_source a
LEFT JOIN weird_data_source b ON
a.id = b.id
LEFT OUTER JOIN weird_data_source c ON
c.id = a.id AND
c.date_col <= a.date_col AND
c.timestamp_col < a.timestamp_col
LEFT OUTER JOIN weird_data_source d ON
d.id = a.id AND
d.date_col <= a.date_col AND
d.timestamp_col < a.timestamp_col AND
d.timestamp_col > c.timestamp_col
WHERE
d.id IS NULL
The query is basically looking for a matching row where no other matching row is found with a greater timestamp_col
value - hence the d.id IS NULL
. That column will only be NULL
if no match is found.
Upvotes: 1