xtine.k
xtine.k

Reputation: 39

MySQL union merge duplicate rows in two tables

I am looking to combine two tables where Table A (multiple_choice) contains the majority of survey answers and Table B (free) contains only text response answers. However, there are entries in Table A and B for certain questions.

My current method is using UNION ALL in order to merge the data in the two tables but includes two rows for what I want to be one row.

Is there a way to take the duplicates based on question id from both tables and merge the rows where have table a has a value in column multianswer and table b has a value in column response?

Here is my statement:

SELECT sId, qId, group_concat(multianswer), response 
FROM multiple_choice
GROUP BY sId, qId

UNION ALL

SELECT sId, qId, '' as multianswer, response 
FROM text_response 
GROUP BY sId

Tables:

Table A multiple_choice
sId   qId  multianswer  response
1001  1    1
1001  2    3
1001  2    4
1001  2    5                                
1001  3    6            college
1001  5    1

Table B text_response
sId   qid  response
1001  1    [email protected]
1001  4    it is of no use to me
1001  5    another other response

Desired Result:

sId   qid  multianswer  response
1001  1    1           [email protected]
1001  2    3,4,5
1001  3    6           college
1001  4                it is of no use to me
1001  5    1           another other response

Code Result:

sId   qid  multianswer  response
1001  1    1
1001  1                [email protected]
1001  2    3,4,5
1001  3    6           college
1001  4                it is of no use to me
1001  5    1
1001  5                another other response   

Upvotes: 1

Views: 2788

Answers (2)

With the operator ALL the duplicate removal doesn't work. Just use UNION.

For more information read: https://dev.mysql.com/doc/refman/5.7/en/union.html

Upvotes: 0

mable
mable

Reputation: 144

You need to keep the "ALL" in the union in order to keep both rows for that sId, qId grouping. Looking at your data, for the row pairs in question, multianswer/response is either something or nothing, and '[email protected]' > null. Without max(), it will automatically get the value for those columns from the first row in the grouping.

So, this may or may not be the best way to do it, but should get the job done. Wrap your existing query with another that handles the grouping.

SELECT sId, qId, max(multianswer) as multianswer, max(response) as response
FROM (
     SELECT sId, qId, group_concat(multianswer) as multianswer, response 
     FROM multiple_choice
     GROUP BY sId, qId
     UNION ALL
     SELECT sId, qId, '' as multianswer, response 
     FROM text_response 
     GROUP BY sId
) fobar
GROUP BY sId, qId

Upvotes: 2

Related Questions