avatar
avatar

Reputation: 12485

How can I generate a new row out of two other rows in Postgres?

I have some data in a Postgres table that looks like this:

1 apple datetime1
2 orange datetime2
3 apple datetime3
4 orange datetime4
5 apple datetime5
6 orange datetime6
.

The datetime is always in ascending order and the majority of times the apple rows are inserted first and orange second with some exceptions that I have to catch and eliminate.

What I practically need is a Postgres query that will pair apples and oranges only:

1 apple datetime1 2 orange datetime2
3 apple datetime3 4 orange datetime4
5 apple datetime5 6 orange datetime6

Apples should never be paired with other apples and oranges should never be paired with other oranges.

There are couple of conditions:

1) In the newly generated rows apple should always be first and orange second.

2) Always pair apple and orange rows with the closest datetimes and ignore the other rows.

For example if I have the original data looking like this:

1 apple datetime1
2 apple datetime2
3 orange datetime3
4 orange datetime4

pair

2 apple datetime2 3 orange datetime3

and ignore rows

1 apple datetime1
4 orange datetime4

Any ideas how to do this in Postgres?

Upvotes: 1

Views: 191

Answers (4)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656321

Solution with CTE & window function:

WITH x AS (
    SELECT *
          ,lead(tbl) OVER (ORDER BY id) AS y
    FROM tbl
    )
SELECT x.id,     x.fruit,   x.dt
     , (y).id, (y).fruit, (y).dt
FROM   x
WHERE  fruit = 'apple'
AND    (y).fruit = 'orange'
ORDER  BY x.id;

Could be done as subquery just as well, but per request from @wildplasser. :)

I pick the whole "next" row per window function lead(). Note the syntax with parenthesis to access columns of the composite (row) type y.

Upvotes: 1

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52107

The datetime is always in ascending order and the majority of times the apple rows are inserted first and orange second with some exceptions that I have to catch and eliminate.

If I understand correctly, you want to find if there are any two consecutive rows with the same fruit, right?

If so, you can do it like this:

WITH Q AS (
    SELECT *, ROW_NUMBER() OVER (ORDER BY datetime) R
    FROM YOUR_TABLE
)
SELECT *
FROM Q Q1 JOIN Q Q2 ON Q1.R = Q2.R - 1
WHERE Q1.fruit = Q2.fruit;

In plain English: join every row with the next one (where the meaning of "next" is determined by the datetime ordering) and check if their fruits match. You can play with it in this SQL Fiddle.

BTW, this is not a good database design. You should consider a redesign so "invalid" situation you described above is naturally prevented by the database structure itself.

Upvotes: 2

wildplasser
wildplasser

Reputation: 44230

SET search_path='lutser';

DROP TABLE fruits;
CREATE TABLE fruits
    ( id INTEGER NOT NULL
    , fruit varchar
    , zdate varchar
    );  
INSERT INTO fruits(id,fruit,zdate)
VALUES

(1, 'apple', 'datetime01')
, (2, 'orange', 'datetime02')
, (3, 'apple', 'datetime03')
, (4, 'orange', 'datetime04')
, (5, 'apple', 'datetime05')
, (6, 'orange', 'datetime06')
, (11, 'apple', 'datetime11')
, (12, 'apple', 'datetime12')
, (13, 'orange', 'datetime13')
, (14, 'orange', 'datetime14')
    ;

SELECT fa.id, fa.fruit, fa.zdate
    , fo.id, fo.fruit, fo.zdate
FROM fruits fa
JOIN fruits fo ON fa.zdate < fo.zdate
WHERE fa.fruit = 'apple' AND fo.fruit = 'orange'
AND NOT EXISTS (
    SELECT *
    FROM fruits nx
    WHERE nx.fruit = 'orange'
    AND nx.zdate > fa.zdate
    AND nx.zdate < fo.zdate
    )   
AND NOT EXISTS (
    SELECT *
    FROM fruits nx
    WHERE nx.fruit = 'apple'
    AND nx.zdate < fo.zdate
    AND nx.zdate > fa.zdate
    )   
    ;

Result:

DROP TABLE
CREATE TABLE
INSERT 0 10
 id | fruit |   zdate    | id | fruit  |   zdate    
----+-------+------------+----+--------+------------
  1 | apple | datetime01 |  2 | orange | datetime02
  3 | apple | datetime03 |  4 | orange | datetime04
  5 | apple | datetime05 |  6 | orange | datetime06
 12 | apple | datetime12 | 13 | orange | datetime13
(4 rows)

Upvotes: 0

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125204

select 
    t0.id, t0.fruit, t0.datetime, 
    t1.id, t1.fruit, t1.datetime
from t t0
inner join t t1 on 
    t0.fruit = 'apple' 
    and 
    t1.fruit = 'orange'
    and
    t0.datetime < t1.datetime
order by t1.datetime - t0.datetime
limit 1

Upvotes: 0

Related Questions