phew
phew

Reputation: 836

How to join multiple tables using two aggregate functions without "duplicating" aggregated data?

I am trying to display one tune (song) per row and all data related to that specific tune's tune_id, if there is more entries for a field related to the tune_id I want the data aggregated and concatenated by ' & '. However, I can not figure out why string_agg(artist, ' & ') is duplicating the artist's name for every producer entry the tune_id has. Furthermore the string_agg(producer, ' & ') function does not display the desired result either. I assume I am incorrectly using GROUP BY or both of the aggregate functions.

I am currently running PostgreSQL 9.1.16, though it will be upgraded to latest version in a few days.

Here is a SQLFiddle with everything set up like described below.

The database structure is:

CREATE TABLE tunes (
    tune_id               SERIAL    PRIMARY KEY,
    tune                  TEXT      NOT NULL
);

CREATE TABLE artistnames (
    artistname_id         SERIAL    PRIMARY KEY,
    artist                TEXT      UNIQUE NOT NULL
);

CREATE TABLE artists (
   artist_id              SERIAL    PRIMARY KEY,
   artistname_id          SERIAL    REFERENCES artistnames (artistname_id),
   tune_id                SERIAL    REFERENCES tunes (tune_id)
);

CREATE TABLE producernames (
    producername_id       SERIAL      PRIMARY KEY,
    producer              TEXT        UNIQUE NOT NULL
);

CREATE TABLE producers (
    producer_id           SERIAL      PRIMARY KEY,
    producername_id       SERIAL      REFERENCES producernames (producername_id),
    tune_id               SERIAL      REFERENCES tunes (tune_id)
);

The dummy data to test the database with is:

TRUNCATE TABLE tunes, artistnames, artists, producernames, producers CASCADE;

ALTER SEQUENCE tunes_tune_id_seq RESTART WITH 1;
ALTER SEQUENCE artistnames_artistname_id_seq RESTART WITH 1;
ALTER SEQUENCE artists_artist_id_seq RESTART WITH 1;
ALTER SEQUENCE producernames_producername_id_seq RESTART WITH 1;
ALTER SEQUENCE producers_producer_id_seq RESTART WITH 1;

INSERT INTO tunes (tune)
  VALUES ('Only Takes Love'), ('Some Boy');

INSERT INTO artistnames (artist) 
  VALUES ('Sizzla'), ('Tanto Metro'), ('Devonte');

INSERT INTO artists (artistname_id, tune_id)
  VALUES (1, 1), (2, 2), (3, 2);

INSERT INTO producernames (producer)
  VALUES ('Brad "Riprock" Daymond'), ('Alexander Greggs'), ('Dennis');

INSERT INTO producers (producername_id, tune_id)
  VALUES (1, 1), (2, 1), (3, 1), (1, 2), (3, 2);

The query I am working with looks like:

SELECT
    string_agg(artist, ' & ') AS artist,
    tune,
    string_agg(producer, ' & ') AS producer
FROM tunes
    FULL JOIN artists USING (tune_id)
    FULL JOIN artistnames USING (artistname_id)
    FULL JOIN producers USING (tune_id)
    FULL JOIN producernames USING (producername_id)
GROUP BY tune;

The result returned by this query looks like:

+-----------------------------------------------+-----------------+-------------------------------------------------------------------+
| artist                                        | tune            | producer                                                          |
+-----------------------------------------------+-----------------+-------------------------------------------------------------------+
| Sizzla & Sizzla & Sizzla                      | Only Takes Love | Brad "Riprock" Daymond & Alexander Greggs & Dennis                |
+-----------------------------------------------+-----------------+-------------------------------------------------------------------+
| Tanto Metro & Tanto Metro & Devonte & Devonte | Some Boy        | Brad "Riprock" Daymond & Dennis & Brad "Riprock" Daymond & Dennis |
+-----------------------------------------------+-----------------+-------------------------------------------------------------------+

Here's the result that I am trying to achieve to receive:

+-----------------------+-----------------+----------------------------------------------------+
| artist                | tune            | producer                                           |
+-----------------------+-----------------+----------------------------------------------------+
| Sizzla                | Only Takes Love | Brad "Riprock" Daymond & Alexander Greggs & Dennis |
+-----------------------+-----------------+----------------------------------------------------+
| Tanto Metro & Devonte | Some Boy        | Brad "Riprock" Daymond & Dennis                    |
+-----------------------+-----------------+----------------------------------------------------+

EDIT: I'm not quite sure if the title/question fits the problem described, if someone has a better title/question suggestion fitting my problem, let me know.

Upvotes: 0

Views: 153

Answers (1)

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125564

SQL Fiddle

SELECT
    string_agg(distinct(artist), ' & ') AS artist,
    tune,
    string_agg(distinct(producer), ' & ') AS producer
FROM 
    tunes
    left join (
        artists 
        left JOIN 
        artistnames USING (artistname_id)
    ) a USING (tune_id)
    left JOIN (
        producers
        left JOIN 
        producernames USING (producername_id)
    ) b USING (tune_id)
GROUP BY tune;

Upvotes: 1

Related Questions