Reputation: 12485
I have some data in a Postgres table that looks like this:
1 apple datetime1
2 orange datetime2
3 apple datetime3
4 orange datetime4
5 apple datetime5
6 orange datetime6
.
The datetime is always in ascending order and the majority of times the apple rows are inserted first and orange second with some exceptions that I have to catch and eliminate.
What I practically need is a Postgres query that will pair apples and oranges only:
1 apple datetime1 2 orange datetime2
3 apple datetime3 4 orange datetime4
5 apple datetime5 6 orange datetime6
Apples should never be paired with other apples and oranges should never be paired with other oranges.
There are couple of conditions:
1) In the newly generated rows apple should always be first and orange second.
2) Always pair apple and orange rows with the closest datetimes and ignore the other rows.
For example if I have the original data looking like this:
1 apple datetime1
2 apple datetime2
3 orange datetime3
4 orange datetime4
pair
2 apple datetime2 3 orange datetime3
and ignore rows
1 apple datetime1
4 orange datetime4
Any ideas how to do this in Postgres?
Upvotes: 1
Views: 191
Reputation: 656321
Solution with CTE & window function:
WITH x AS (
SELECT *
,lead(tbl) OVER (ORDER BY id) AS y
FROM tbl
)
SELECT x.id, x.fruit, x.dt
, (y).id, (y).fruit, (y).dt
FROM x
WHERE fruit = 'apple'
AND (y).fruit = 'orange'
ORDER BY x.id;
Could be done as subquery just as well, but per request from @wildplasser. :)
I pick the whole "next" row per window function lead()
. Note the syntax with parenthesis to access columns of the composite (row) type y
.
Upvotes: 1
Reputation: 52107
The datetime is always in ascending order and the majority of times the apple rows are inserted first and orange second with some exceptions that I have to catch and eliminate.
If I understand correctly, you want to find if there are any two consecutive rows with the same fruit, right?
If so, you can do it like this:
WITH Q AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY datetime) R
FROM YOUR_TABLE
)
SELECT *
FROM Q Q1 JOIN Q Q2 ON Q1.R = Q2.R - 1
WHERE Q1.fruit = Q2.fruit;
In plain English: join every row with the next one (where the meaning of "next" is determined by the datetime
ordering) and check if their fruits match. You can play with it in this SQL Fiddle.
BTW, this is not a good database design. You should consider a redesign so "invalid" situation you described above is naturally prevented by the database structure itself.
Upvotes: 2
Reputation: 44230
SET search_path='lutser';
DROP TABLE fruits;
CREATE TABLE fruits
( id INTEGER NOT NULL
, fruit varchar
, zdate varchar
);
INSERT INTO fruits(id,fruit,zdate)
VALUES
(1, 'apple', 'datetime01')
, (2, 'orange', 'datetime02')
, (3, 'apple', 'datetime03')
, (4, 'orange', 'datetime04')
, (5, 'apple', 'datetime05')
, (6, 'orange', 'datetime06')
, (11, 'apple', 'datetime11')
, (12, 'apple', 'datetime12')
, (13, 'orange', 'datetime13')
, (14, 'orange', 'datetime14')
;
SELECT fa.id, fa.fruit, fa.zdate
, fo.id, fo.fruit, fo.zdate
FROM fruits fa
JOIN fruits fo ON fa.zdate < fo.zdate
WHERE fa.fruit = 'apple' AND fo.fruit = 'orange'
AND NOT EXISTS (
SELECT *
FROM fruits nx
WHERE nx.fruit = 'orange'
AND nx.zdate > fa.zdate
AND nx.zdate < fo.zdate
)
AND NOT EXISTS (
SELECT *
FROM fruits nx
WHERE nx.fruit = 'apple'
AND nx.zdate < fo.zdate
AND nx.zdate > fa.zdate
)
;
Result:
DROP TABLE
CREATE TABLE
INSERT 0 10
id | fruit | zdate | id | fruit | zdate
----+-------+------------+----+--------+------------
1 | apple | datetime01 | 2 | orange | datetime02
3 | apple | datetime03 | 4 | orange | datetime04
5 | apple | datetime05 | 6 | orange | datetime06
12 | apple | datetime12 | 13 | orange | datetime13
(4 rows)
Upvotes: 0
Reputation: 125204
select
t0.id, t0.fruit, t0.datetime,
t1.id, t1.fruit, t1.datetime
from t t0
inner join t t1 on
t0.fruit = 'apple'
and
t1.fruit = 'orange'
and
t0.datetime < t1.datetime
order by t1.datetime - t0.datetime
limit 1
Upvotes: 0