SoluableNonagon
SoluableNonagon

Reputation: 11755

SQL Left Join Multiple Tables and count values conditionally in each table

I am having some trouble putting together a SQL statement properly because I don't have much experience SQL, especially aggregate functions. Safe to say I don't really know what I'm doing outside of the basic SQL structure. I can do regular joins, but not complex ones.

I have some tables: 'Survey', 'Questions', 'Session', 'ParentSurvey', and 'ParentSurveyQuestion'. Structurally, a survey can have questions, it can have users that started the survey (a session), and it can have a parent survey whose questions get imported into the current survey.

What I want to do is get information for a each survey in the Survey table; total questions it has, how many sessions have been started (conditionally, ones that have not finished), and the number of questions in the parents survey. The three joined tables can but do not have to contain any values, and if they don't then 0 should be returned by COUNT. The common field in three of the tables is a variation of 'survey_id'

Here is my SQL so far, I put the table structure below it.

 SELECT 
   `kp_survey_id`,
   COALESCE( q.cnt, 0 ) AS questionsAmount,
   COALESCE( s.cnt, 0 ) AS sessionsAmount
   COALESCE( p.cnt, 0 ) AS parentQAmount,
 FROM `Survey`
   LEFT JOIN                    <-- I'd like the count of questions for this survey
      ( SELECT COUNT(*) AS cnt 
      FROM Questions
      GROUP BY kf_survey_id ) q
     ON Survey.kp_survey_id = Questions.kf_survey_id
   LEFT JOIN 
      ( SELECT COUNT(*) AS cnt    <-- I'd like the count of started sessions for this survey
      FROM Session
      WHERE session_status = 'started'  <-- should this be Session.session_status?
      GROUP BY kf_survey_id ) s
     ON Survey.kp_survey_id = Session.kf_survey_id
   LEFT JOIN            
      ( SELECT COUNT(*) AS cnt    <-- I'd like the count of questions in the parent survey with this survey id
      FROM ParentSurvey
      GROUP BY kp_parent_survey_id ) p
     ON Survey.kf_parent_survey_id = ParentSurveyQuestion.kf_parent_survey_id

'kp' prefix means primary key, while 'kf' prefix means foreign key
Structure:

Survey: 'kp_survey_id' | 'kf_parent_survey_id'

Question: 'kp_question_id' | 'kf_survey_id'

Session: 'kp_session_id' | 'kf_survey_id' | 'session_status'

ParentSurvey: 'kp_parent_survey_id' | 'survey_name'

ParentSurveyQuestion: 'kp_parent_question_id' | 'kf_parent_survey_id'

There are also other columns in each table like 'name' or 'account_id', but i don't think they matter in this case

I'd like to know if I'm doing this correctly or if I'm missing something. I'm repurposing some code I found here on stackoverflow and modifying it to meet my needs, as I haven't seen conditional aggregation for more than three tables on this site.

My expected output is something like:

kp_survey_id   |   questionsAmount   |   sessionsAmount   |   parentQAmount  
    1          |         3           |         0          |        3
    2          |         0           |         5          |        3

Upvotes: 0

Views: 2024

Answers (2)

sgeddes
sgeddes

Reputation: 62831

I think you were pretty close -- just need to fix your joins and include the survey id in the subqueries to use in those joins:

SELECT 
   `kp_survey_id`,
   COALESCE( q.cnt, 0 ) AS questionsAmount,
   COALESCE( s.cnt, 0 ) AS sessionsAmount
   COALESCE( p.cnt, 0 ) AS parentQAmount,
 FROM `Survey`
   LEFT JOIN                    
      ( SELECT COUNT(*) cnt, kf_survey_id AS cnt 
      FROM Questions
      GROUP BY kf_survey_id ) q
     ON Survey.kp_survey_id = q.kf_survey_id
   LEFT JOIN 
      ( SELECT COUNT(*) cnt, kf_survey_id
      FROM Session
      WHERE session_status = 'started'  
      GROUP BY kf_survey_id ) s
     ON Survey.kp_survey_id = s.kf_survey_id
   LEFT JOIN            
      ( SELECT COUNT(*) cnt, kp_parent_survey_id
      FROM ParentSurvey
      GROUP BY kp_parent_survey_id ) p
     ON Survey.kf_parent_survey_id = p.kp_parent_survey_id

Upvotes: 1

Tom
Tom

Reputation: 6663

One thing you need to do is correct your joins. When you are joining to a subquery, you need to use the alias of the subquery. In your case you are using the alias of the table being used in the subquery.

Another thing you need to change is to include the field you wish to use in your JOIN in the subquery.

Make these changes and try running. Do you get an error or the desired results?

SELECT 
   `kp_survey_id`,
   COALESCE( q.cnt, 0 ) AS questionsAmount,
   COALESCE( s.cnt, 0 ) AS sessionsAmount
   COALESCE( p.cnt, 0 ) AS parentQAmount,
 FROM `Survey`
   LEFT JOIN                    <-- I'd like the count of questions for this survey
      ( SELECT kf_survey_id, COUNT(*) AS cnt 
      FROM Questions
      GROUP BY kf_survey_id ) q
     ON Survey.kp_survey_id = q.kf_survey_id
   LEFT JOIN 
      ( SELECT kf_survey_id, COUNT(*) AS cnt    <-- I'd like the count of started sessions for this survey
      FROM Session
      WHERE session_status = 'started'  <-- should this be Session.session_status?
      GROUP BY kf_survey_id ) s
     ON Survey.kp_survey_id = s.kf_survey_id
   LEFT JOIN            
      ( SELECT kp_parent_survey_id, COUNT(*) AS cnt    <-- I'd like the count of questions in the parent survey with this survey id
      FROM ParentSurvey
      GROUP BY kp_parent_survey_id ) p
     ON Survey.kf_parent_survey_id = p.kf_parent_survey_id

Upvotes: 1

Related Questions