Mark Fox
Mark Fox

Reputation: 8924

Can you GROUP BY while extracting specific row values to selected columns

I will sometimes run into a situation where I have a parent/child relationship and I want to select all the children for each parent.

Here is a superficial example.

SELECT parent.id as parent_id, child.id as child_id
FROM parent LEFT JOIN child ON parent.id = child.parent_id
ORDER BY parent.id

Would yield

parent_id  child_id
1          20
1          21
2          33
2          67

I would like to do something like

SELECT parent.id, child1.id as child1_id, child2.id as child2_id
FROM parent LEFT JOIN child ON parent.id = child.parent_id
GROUP BY parent.id

And yield

parent_id  child1_id  child2_id
1          20         21
2          33         67

I'm trying to avoid using GROUP_CONCAT because I want to create separate columns for each child.

I realize I could join the child on twice and filter the selection, but given my actual dataset it could get hairy joining twice. Also, it'd be really cool if you could do this with an arbitrary number of children like:

parent_id  child1_id  child2_id  child3_id
1          20         21         null
2          33         67         109
3          45         null       null

Upvotes: 1

Views: 189

Answers (3)

sgeddes
sgeddes

Reputation: 62831

What you're looking for is called a PIVOT. MySQL does not support the PIVOT command, but you can simulate it by using MAX with CASE.

This is useful if you know the number of children or if you could have a maximum number.

SELECT parent.id as parent_id, 
  MAX(CASE WHEN rn = 1 THEN child.id END) child1_id,
  MAX(CASE WHEN rn = 2 THEN child.id END) child2_id,
  MAX(CASE WHEN rn = 3 THEN child.id END) child3_id,
  MAX(CASE WHEN rn = 4 THEN child.id END) child4_id,
  MAX(CASE WHEN rn = 5 THEN child.id END) child5_id
FROM parent 
  LEFT JOIN (
    SELECT *, 
      @rn:=IF(@prevParent=parent_id,@rn+1,1) rn,
      @prevParent:=parent_id
    FROM child JOIN (SELECT @rn:=0,@prevParent:=0) t
) child ON parent.id = child.parent_id
GROUP BY parent.id
ORDER BY parent.id;

You'll need to look into creating Dynamic SQL if you don't know the number of children/potential columns.

Here is an example:

SELECT
  GROUP_CONCAT(DISTINCT
    CONCAT(
      'MAX(IF(rn = ', rn, ',child.id,NULL)) AS child_id', rn)
  ) INTO @sql
FROM parent 
  LEFT JOIN (
    SELECT *, 
      @rn:=IF(@prevParent=parent_id,@rn+1,1) rn,
      @prevParent:=parent_id
    FROM child JOIN (SELECT @rn:=0,@prevParent:=0) t
) child ON parent.id = child.parent_id
;

SET @sql = CONCAT('SELECT parent.id as parent_id, 
                          ', @sql, ' 
                  FROM parent 
                    LEFT JOIN (
                      SELECT *, 
                        @rn:=IF(@prevParent=parent_id,@rn+1,1) rn,
                        @prevParent:=parent_id
                      FROM child JOIN (SELECT @rn:=0,@prevParent:=0) t
                  ) child ON parent.id = child.parent_id
                  GROUP BY parent.id
                  ORDER BY parent.id;');

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269445

You can do this without variables and without dynamic SQL. The idea is to use a subquery and then to get the information about the columns.

To get all the children:

SELECT parent.id, group_concat(child.id) as children, count(*) as numchildren
FROM parent LEFT JOIN
     child ON parent.id = child.parent_id
GROUP BY parent.id

Now extract these, using substring_index() and reverse(). This is a complicated expression that returns the nth value from a list:

select parent_id,
       substring_index(children, ',', 1) as child1,
       reverse(substring_index(reverse(substring_index(children, ',', 2)), ',', 1)) as child2,
       reverse(substring_index(reverse(substring_index(children, ',', 3)), ',', 1)) as child3,
       reverse(substring_index(reverse(substring_index(children, ',', 4)), ',', 1)) as child4
from (SELECT parent.id, group_concat(child.id) as children, count(*) as numchildren
      FROM parent LEFT JOIN
           child ON parent.id = child.parent_id
      GROUP BY parent.id
     ) t

Upvotes: 1

Pieter Geerkens
Pieter Geerkens

Reputation: 11883

You would need to dynamically create the necessary SQ.

The SQL for your desired query would need to statically know the number of columns to create, and the column names to use. By creating the SQL query on the fly, and then running it, you could achieve the desired effect.

Upvotes: 0

Related Questions