Kung Fu Ninja
Kung Fu Ninja

Reputation: 3752

sql join including null and non existing records

I have the following tables. I am trying to get a join of these 4 tables where it will list all the questions, even if the user didn't answer anything along with user and category. any help?

select * 
from  tbl_category  c
inner join  tbl_questions  q on q.categoryID = c.categoryID
left join  tbl_answers  a on a.questionID = q.questionID
left join  tbl_users  u on u.userID = a.userID
order by u.userID

this will only give 9 records. i will need to know unanswered questions, and also users who didn't answer to any question.

CREATE TABLE [dbo].[tbl_users](
    [userID] [int] IDENTITY(1,1) NOT NULL,
    [firstName] [varchar](50) NULL,
    [lastName] [varchar](50) NULL
) 
GO

CREATE TABLE [dbo].[tbl_questions](
    [questionID] [int] IDENTITY(1,1) NOT NULL,
    [categoryID] [int] NOT NULL,
    [description] [varchar](100) NULL
) 
GO

CREATE TABLE [dbo].[tbl_category](
    [categoryID] [int] IDENTITY(1,1) NOT NULL,
    [description] [varchar](50) NULL
) 
GO

CREATE TABLE [dbo].[tbl_answers](
    [answer_id] [int] IDENTITY(1,1) NOT NULL,
    [answerText] [varchar](250) NULL,
    [questionID] [int] NULL,
    [userID] [int] NULL
) 
GO

-- tbl_questions
insert into tbl_questions ( categoryID, description ) 
values ( 1, 'How do you balance life and work?') 

insert into tbl_questions ( categoryID, description ) 
values ( 1, 'Do you check voicemail and email when on vacation?') 

insert into tbl_questions ( categoryID, description ) 
values ( 1, 'What is your favorite book?') 

insert into tbl_questions ( categoryID, description ) 
values ( 2, 'What were your responsibilities?') 

insert into tbl_questions ( categoryID, description ) 
values ( 2, 'What is your greatest strength?') 

insert into tbl_questions ( categoryID, description ) 
values ( 2, 'What is your greatest weakness?') 

insert into tbl_questions ( categoryID, description ) 
values ( 2, 'How do you evaluate success?') 

-- tbl_users
insert into tbl_users ( firstName, lastName ) 
values ( 'Alessandra', 'Ambrosio' )

insert into tbl_users ( firstName, lastName ) 
values ( 'Adriana', 'Lima' )

insert into tbl_users ( firstName, lastName ) 
values ( 'Daniela', 'Pestova' )


-- tbl_answers
insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'answer for q1', 1, 1) 

insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'answer for q2', 2, 1) 

insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'answer for q3', 3, 1) 

insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'answer for q4', 4, 1) 

insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'answer for q5', 5, 1) 

insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'answer for q6', 6, 1) 

insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'answer for q7', 7, 1) 

insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'other answer for q5', 5, 2) 

insert into tbl_answers ( answerText, questionID, userID ) 
values ( 'other answer for q2', 2, 2) 

-- tbl_category
INSERT tbl_category (categoryID, description) VALUES (1, 'About You')
INSERT tbl_category (categoryID, description) VALUES (2, 'Job') 

Upvotes: 1

Views: 521

Answers (3)

Andriy M
Andriy M

Reputation: 77667

i will need to know unanswered questions, and also users who didn't answer to any question.

You are looking for two distinct things. That means to me you need two different queries.

Unanswered questions are questions that have no matches in tbl_answers. This means tbl_answers anti-joined to tbl_questions:

SELECT
  Category = c.description,
  Question = q.description
FROM
  tbl_questions AS q
INNER JOIN
  tbl_category AS c ON q.categoryID = c.categoryID
WHERE NOT EXISTS (
  SELECT *
  FROM tbl_answers AS a
  WHERE q.questionID = a.questionID
);

And users who didn't answer any question are, again, those that have no matches in tbl_answers. So, an anti-join for this one too:

SELECT
  u.firstName,
  u.lastName
FROM
  tbl_users AS u
WHERE NOT EXISTS (
  SELECT *
  FROM tbl_answers AS a
  WHERE u.userID = a.userID
);

Upvotes: 2

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115530

You want all combinations of users and questions, even for those questions that a user did not answer. This kind of problem is somewhat complicated but not that uncommon.

The way to solve it to first create a cartesian product (CROSS JOIN) between questions and users - and only then LEFT JOIN to the answers.

So the query will be:

SELECT                           -- choose only the columns you want, not all (*)
    u.*, c.*, q.*, a.*     
FROM

questions are related to categories. We want that:

    tbl_category AS c
  JOIN 
    tbl_questions AS q  ON  q.categoryID = c.categoryID

then get all combinations of the above (questions) with (users)

  CROSS JOIN
    tbl_users AS u 

then join (answers). Notice how answers have 2 joining conditions, one for each of the above tables (questions and users):

  LEFT JOIN
    tbl_answers AS a  ON  a.questionID = q.questionID
                      AND a.userID = u.userID

The full query then becomes:

SELECT  
    u.*, c.*, q.*, a.*     
FROM
    tbl_category AS c
  JOIN 
    tbl_questions AS q  ON  q.categoryID = c.categoryID
  CROSS JOIN
    tbl_users AS u 
  LEFT JOIN
    tbl_answers AS a  ON  a.questionID = q.questionID
                      AND a.userID = u.userID
ORDER BY
    u.userID ;

Upvotes: 7

Oliver
Oliver

Reputation: 167

The user table needs to be the 'left' table as its currently defined. You could get into an argument about left and right joins but from a readability and personal preference left join is the way to go.

select * 
from tbl_users u 
left join (select q.questionID,q.categoryID,q.description qdesc,
                c.description catdesc,a.answer_id,a.answerText,a.userID
            from tbl_category  c
            inner join  tbl_questions  q on q.categoryID = c.categoryID
            left join  tbl_answers  a on a.questionID = q.questionID
            )qa on u.userid=qa.userid

If you need every question to show up for every user I think you will need another table that links user to question. Sometjhing like

CREATE TABLE [dbo].[Survey](
    [surveyID] [int] IDENTITY(1,1) NOT NULL,
    [questionID] [int] IDENTITY(1,1) NOT NULL,
     [userID] [int] NULL

Upvotes: -2

Related Questions