Julian
Julian

Reputation: 393

postgreSQL Duplicate rows counting on join

I have one complicated question. I'll try to explain it with example:

have one table that have primary key, and I want to join other table there the first's table primary key is foreign key, and I want If in the second table there is duplicate foreign key to select the number of repeatability. For example:

1-st table:

  id    name 
  ---  -----
  1     Greg
  2     Alan 
  3    George 
  4     John 
  5     Peter   

2-nd table

 id       aid        data   
 ---     -----      -------
 1        2          CCCV   
 2        2          VVVV 
 3        3          DDDDD 
 4        3          SSSS 
 5        4          PPPPP 

I want the result of the join to be:

 id(1st table)  aid   name    Data   Number
 -----------    ----  -----   -----  -----
 1               null  Greg    null   1
 2                1    Alan    CCCV   1
 2                2    Alan    VVVV   2
 3                3    George  DDDDD  1
 3                4    George  SSSS   2
 4                5    John    PPPPP  1
 5               null  Peter   null   1

I searched a lot, I couldn't find anything. Maybe I do not know how search, or there is no such thing as what I want to do.

Upvotes: 4

Views: 3041

Answers (3)

MatBailie
MatBailie

Reputation: 86706

As per my comment, you've tagged this both MySQL and PostgreSQL.

This answer is for PostgreSQL.

SELECT
  table1.id,
  table2.aid,
  table1.name,
  table2.data,
  ROW_NUMBER() OVER (PARTITION BY table1.id ORDER BY table2.aid) AS number
FROM
  table1
LEFT JOIN
  table2
    ON table1.id = table2.aid

Upvotes: 2

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656391

Queries for PostgreSQL 8.3 which has no window functions.
With bigger tables it is regularly much faster to use a JOIN instead of a correlated sub-query.
The first query aggregates values for Table2 before joining to Table1, which should befaster, too:

SELECT t1.id, t2.aid, t1.name, t2.data, COALESCE(t2.ct, 1) AS number
FROM   Table1 t1
LEFT   JOIN  (
    SELECT x.aid, x.data, count(y.aid) + 1 AS ct
    FROM   Table2 x
    LEFT   JOIN Table2 y ON x.aid = y.aid AND x.id > y.id
    GROUP  BY x.aid, x.data
    ) t2 ON t2.aid = t1.id
ORDER  BY t1.id, t2.ct;

And ORDER BY should be fixed.

Alternative without sub-query. Might be faster, yet:

SELECT t1.id, t2.aid, t1.name, t2.data, count(*) + count(t3.id) AS number
FROM   Table1 t1
LEFT   JOIN Table2 t2 ON t2.aid = t1.id
LEFT   JOIN Table2 t3 ON t3.aid = t2.aid AND t3.id < t2.id
GROUP  BY t1.id, t2.aid, t1.name, t2.data
ORDER  BY t1.id, count(t3.id);

Not sure, didn't test with a bigger set. Test performance with EXPLAIN ANALYZE. Could you report back your results?

Upvotes: 0

Terje D.
Terje D.

Reputation: 6315

SELECT Table1.id, Table2.id as aid, Table1.name, Table2.data,
GREATEST(1, (SELECT COUNT(*)
             FROM Table2 t2
             WHERE t2.aid = Table1.id
             AND t2.id <= Table2.id))
AS number
FROM Table1
LEFT JOIN Table2
ON Table2.aid = Table1.id
ORDER BY id, aid;

works in both MySQL and PostgreSQL.

Upvotes: 2

Related Questions