Reputation: 1695
I've got some sensory information going into a table. I have figured out the query that will tell me exactly when the value at a particular device changes.
What I need to know is the status of all of the other sensors at that time. The trick is, the timestamps won't be equal. I could get a data point from sensor 1, then 3 minute later, one from sensor 2, and then 30 seconds later, another from sensor 1.
So, here is an example of what I am talking about:
--- data_table ---
sensor | state | stime
-------+-------+---------------------
1 | A | 2014-08-17 21:42:00
1 | A | 2014-08-17 21:43:00
2 | B | 2014-08-17 21:44:00
3 | C | 2014-08-17 21:45:00
2 | D | 2014-08-17 21:46:00
3 | C | 2014-08-17 21:47:00
1 | B | 2014-08-17 21:48:00
3 | A | 2014-08-17 21:49:00
2 | D | 2014-08-17 21:50:00
2 | A | 2014-08-17 21:51:00
Now, I know the query that will deliver me the state changes. I've got this down, and it's in a view. That table would look like:
--- state_changed_view ---
sensor | state | stime
-------+-------+---------------------
2 | D | 2014-08-17 21:46:00
1 | B | 2014-08-17 21:48:00
3 | A | 2014-08-17 21:49:00
2 | A | 2014-08-17 21:51:00
What I want is a JOIN, where I can get all of the values of the 'state_changed_view', but also the values of the other corresponding sensors at the 'sensor_timestamp' within the view.
So, ideally, I want my result to look like (or something similar to):
sensor | state | stime | sensor | state | stime
-------+-------+---------------------+--------+-------+---------------------
2 | D | 2014-08-17 21:46:00 | 1 | A | 2014-08-17 21:43:00
2 | D | 2014-08-17 21:46:00 | 2 | D | 2014-08-17 21:46:00
2 | D | 2014-08-17 21:46:00 | 3 | C | 2014-08-17 21:45:00
1 | B | 2014-08-17 21:48:00 | 1 | B | 2014-08-17 21:48:00
1 | B | 2014-08-17 21:48:00 | 2 | D | 2014-08-17 21:46:00
1 | B | 2014-08-17 21:48:00 | 3 | C | 2014-08-17 21:47:00
3 | A | 2014-08-17 21:49:00 | 1 | B | 2014-08-17 21:48:00
3 | A | 2014-08-17 21:49:00 | 2 | D | 2014-08-17 21:46:00
3 | A | 2014-08-17 21:49:00 | 3 | A | 2014-08-17 21:49:00
2 | A | 2014-08-17 21:51:00 | 1 | B | 2014-08-17 21:48:00
2 | A | 2014-08-17 21:51:00 | 2 | A | 2014-08-17 21:51:00
2 | A | 2014-08-17 21:51:00 | 3 | A | 2014-08-17 21:49:00
As you can see, I need the most recent row in 'data_table' for each sensor, for every row that exists in state_changed_view
.
I just don't know how to get the SQL to get me the most recent row according to a particular timestamp.
This is on a PL/pgSQL system, so anything compatible with Postgres is handy.
Upvotes: 1
Views: 577
Reputation: 656706
For a small, given set of sensors to retrieve (this works for Postgres 8.4 or later):
SELECT c.sensor AS sensor_change
, d1.state AS state_1, d1.stime AS stime_1
, d2.state AS state_2, d2.stime AS stime_2
, d3.state AS state_3, d3.stime AS stime_3
FROM (
SELECT sensor, stime
, lag(state) OVER (PARTITION BY sensor ORDER BY stime)
<> state AS change
, max(CASE WHEN sensor = 1 THEN stime ELSE NULL END) OVER w AS last_1
, max(CASE WHEN sensor = 2 THEN stime ELSE NULL END) OVER w AS last_2
, max(CASE WHEN sensor = 3 THEN stime ELSE NULL END) OVER w AS last_3
FROM data d
WINDOW w AS (ORDER BY stime)
) c
JOIN data d1 ON d1.sensor = 1 AND d1.stime = c.last_1
JOIN data d2 ON d2.sensor = 2 AND d2.stime = c.last_2
JOIN data d3 ON d3.sensor = 3 AND d3.stime = c.last_3
WHERE c.change
ORDER BY c.stime;
Not using the view at all, building on the table directly, that's faster.
This is assuming a UNIQUE INDEX on (sensor, stime)
to be unambiguous. Performance also heavily depends on such an index.
As opposed to @Nick's solution, building on JOIN LATERAL
(Postgres 9.3+), this returns a single row with all values for every change in state.
Since you mentioned PL/pgSQL, I would expect this (highly optimized) plpgsql function to perform better, since it can make do with a single sequential scan of the table:
CREATE OR REPLACE FUNCTION f_sensor_change()
RETURNS TABLE (sensor_change int -- adapt to actual data types!
, state_1 "char", stime_1 timestamp
, state_2 "char", stime_2 timestamp
, state_3 "char", stime_3 timestamp) AS
$func$
DECLARE
r data%rowtype;
BEGIN
FOR r IN
TABLE data ORDER BY stime
LOOP
CASE r.sensor
WHEN 1 THEN
IF r.state = state_1 THEN -- just save stime
stime_1 := r.stime;
ELSIF r.state <> state_1 THEN -- save all & RETURN
stime_1 := r.stime; state_1 := r.state;
sensor_change := 1; RETURN NEXT;
ELSE -- still NULL: init
stime_1 := r.stime; state_1 := r.state;
END IF;
WHEN 2 THEN
IF r.state = state_2 THEN
stime_2 := r.stime;
ELSIF r.state <> state_2 THEN
stime_2 := r.stime; state_2 := r.state;
sensor_change := 2; RETURN NEXT;
ELSE
stime_2 := r.stime; state_2 := r.state;
END IF;
WHEN 3 THEN
IF r.state = state_3 THEN
stime_3 := r.stime;
ELSIF r.state <> state_3 THEN
stime_3 := r.stime; state_3 := r.state;
sensor_change := 3; RETURN NEXT;
ELSE
stime_3 := r.stime; state_3 := r.state;
END IF;
ELSE -- do nothing, ignore other sensors
END CASE;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_sensor_change();
Makes sense for repeated use. Related answer:
SQL Fiddle for Postgres 9.3.
SQL Fiddle for Postgres 8.4.
Upvotes: 3
Reputation: 21336
There are a couple of things making this not-so-straightforward:
state_changed_view
row, but the subquery must mention the corresponding stime
from the view (to restrict it to earlier records). Ordinary subqueries aren't allowed to depend on external fields, but you can accomplish this (as of Postgres 9.3, at least) with a lateral join.MAX(data_table.stime)
, but the corresponding data_table.state
. You could do this with another nested query to retrieve the rest of the row, but SELECT DISTINCT ON
gives you an easy way to fetch the whole thing at once.The end result is something like this:
SELECT *
FROM
state_changed_view,
LATERAL (
SELECT DISTINCT ON (sensor)
sensor,
state,
stime
FROM
data_table
WHERE
data_table.stime <= state_changed_view.stime
ORDER BY
sensor,
stime DESC
) a
Upvotes: 1
Reputation: 864
Start by finding the max time for each sensor and state with a subquery which groups on sensor and state and then join that to the view
SELECT *
FROM
(SELECT sensor, state, MAX(stime) as stime
from data_table
group by sensor, state) a
join state_changed_view on 1=1
Upvotes: 0