fdkgfosfskjdlsjdlkfsf
fdkgfosfskjdlsjdlkfsf

Reputation: 3301

TSQL query returning double the rows with inner join of same table twice

I'm trying to return all middle/highschool courses taught by a given teacher. With an inner join of two tables, 3 rows are displayed correctly. When I do 2nd inner join with the 3rd table, it returns 6 rows instead of 3.

Without using cte, DISTINCT, how can I display 3 rows with empid, the middle classes, and the high school classes? Also, both foreign tables should do joins with the primary table.

IF OBJECT_ID('tempdb..#empl') IS NOT NULL DROP TABLE #empl
IF OBJECT_ID('tempdb..#middlecourses') IS NOT NULL DROP TABLE #middlecourses
IF OBJECT_ID('tempdb..#highcourses') IS NOT NULL DROP TABLE #highcourses

create table #empl
(
    EmpId int,
    Grade int
)
insert into #empl select 1, 5   

create table #middlecourses
(
    EmpId int,
    Grade int,
    Course varchar(20)
)
insert into #middlecourses select 1, 5, 'Science'
insert into #middlecourses select 1, 5, 'Math'  
insert into #middlecourses select 1, 5, 'English'

create table #highcourses
(
    EmpId int,
    Grade int,
    Course varchar(20)
)
insert into #highcourses select 1, 5, 'Calculus'
insert into #highcourses select 1, 5, 'Physics' 
insert into #highcourses select 1, 5, 'CompSci'

select e.empid, e.grade, m.course as 'MiddleCourse'
from #empl e inner join #middlecourses m
on e.empid = m.empid 
and e.grade = m.grade

select e.empid, e.grade, m.course as 'MiddleCourse', h.course as 'HighCourse'
from #empl e inner join #middlecourses m
on e.empid = m.empid 
and e.grade = m.grade
inner join #highcourses h
on e.empid = h.empid
and e.grade = h.grade

drop table #empl
drop table #middlecourses
drop table #highcourses

Upvotes: 9

Views: 4385

Answers (5)

Mukund
Mukund

Reputation: 1689

First of all you need to understand how inner join works. Inner join will get you the record which are present in both the tables which you are joining.

coming to your question, when you execute the following query

select e.empid, e.grade, m.course as 'MiddleCourse'
from #empl e inner join #middlecourses m
on e.empid = m.empid 
and e.grade = m.grade

you will get this record.

empid   grade   MiddleCourse
1   5   Science
1   5   Math
1   5   English

so you get 3 records which is expected, because there are 3 records in #middlecourse table with empid = 1. So it inner join is work like this. It will pick one empid at a time from #empl table and will try to find the matching rows for this empiid in the second table i.e #middlecourses

So you have 3 records from above query. Now when you will add the second inner join then it will try to get empid from above 3 record and will match with the third table which is #highcoures. So for every empiid it will return 3 record. so total you will have 3*3 = 9 record from second query like this.

EmpId   Grade   EmpId   Grade   Course  EmpId   Grade   Course
1        5      1       5       Science  1      5      Calculus
1        5      1       5       Math     1      5      Calculus
1        5      1       5       English  1      5      Calculus
1        5      1       5       Science  1      5      Physics
1        5      1       5       Math     1      5      Physics
1        5      1       5       English  1      5      Physics
1        5      1       5       Science  1      5      CompSci
1        5      1       5       Math     1      5      CompSci
1        5      1       5       English  1      5      CompSci

One solution for this scenario would be do union on your courses table and then do the inner join with #empl table.

select e.EmpId, e.Grade, a.Course from #empl e
inner join (
select * 
from #middlecourses
union
select * from #highcourses) a on e.EmpId = a.EmpId

Upvotes: 0

Giorgos Betsos
Giorgos Betsos

Reputation: 72185

You can use ROW_NUMBER() to match middlecourses with highcourses depending on alphabetical course ordering:

select e.empid, e.grade, m.course as 'MiddleCourse', h.course as 'HighCourse'
from #empl e 
cross apply (
   SELECT course, ROW_NUMBER() over (order by course) as rn
   FROM #middlecourses m
   WHERE e.empid = m.empid AND e.grade = m.grade ) m
cross apply (
   SELECT course, ROW_NUMBER() over (order by course) as rn
   FROM #highcourses h
   WHERE e.empid = h.empid AND e.grade = h.grade ) h
where m.rn = h.rn

Output:

empid   grade   MiddleCourse    HighCourse
-------------------------------------------
1       5       English         Calculus
1       5       Math            CompSci
1       5       Science         Physics

The above will only work in case there is an equal number of middlecourses and highcourses.

In case there is a mismatch between the number of middlecourses and highcourses you can use a bit more complicated variation of the above query:

SELECT e.EmpId, e.Grade, t.MiddleCourse, t.HighCourse
FROM #empl e
INNER JOIN (
   SELECT COALESCE(m.empid, h.empid) AS empid, 
          COALESCE(m.grade, h.grade) AS grade,
          m.Course AS 'MiddleCourse', h.Course as 'HighCourse'
   FROM (SELECT empid, grade, course, 
                ROW_NUMBER() over (partition by empid, grade 
                                   order by course) as rn
         FROM #middlecourses) m
   FULL JOIN (SELECT empid, grade, course, 
                     ROW_NUMBER() over (partition by empid, grade 
                                        order by course) as rn
              FROM #highcourses) h 
   ON m.EmpId = h.EmpId AND m.Grade = h.Grade AND m.rn = h.rn ) t
ON e.EmpId = t.empid AND e.Grade = t.grade

With one more record in highcourses:

insert into #highcourses select 1, 5, 'Algebra'

output is:

EmpId   Grade   MiddleCourse    HighCourse
-------------------------------------------
1       5       English         Algebra
1       5       Math            Calculus
1       5       Science         CompSci
1       5       NULL            Physics

Upvotes: 0

jorgesalvador
jorgesalvador

Reputation: 71

There might be a more elegant solution, but this should do the trick for the scenario given:

select e.empid, e.grade, c.course, c.CourseType
from #empl e 
inner join 
(
SELECT *, 'MiddleCourse' AS CourseType
FROM #middlecourses m 
UNION ALL
SELECT *, 'HighCourse' AS CourseType
FROM #highcourses h
) c ON c.EmpId = e.EmpId AND c.Grade = e.Grade

Upvotes: 2

Joeri
Joeri

Reputation: 225

Yes, edmondson is correct. What you could do is a simple pivot with a ROW_NUMBER() to make the rows unique.

select
    *
from
(
    select e.empid, e.grade, 'MiddleCourses' as [Type] , m.course, ROW_NUMBER() OVER (ORDER BY e.EmpId) ClassNo
    from #empl e inner join #middlecourses m
    on e.empid = m.empid 
    and e.grade = m.grade
    union all
    select e.empid, e.grade, 'HighCourses' as [Type] ,m.course, ROW_NUMBER() OVER (ORDER BY e.EmpId) ClassNo
    from #empl e inner join #highcourses m
    on e.empid = m.empid 
    and e.grade = m.grade
) SourceTable
pivot
(
    MIN(Course)
    FOR [Type] IN (MiddleCourses,HighCourses)
) pivotTable

Upvotes: 0

m.edmondson
m.edmondson

Reputation: 30922

This is because all of your empid and grade are the same. This join is matching many times.

You're already seeing this in your first join, the rows in #Empl are being duplicated three times (becuase it matches all 3 records in #MiddleCourses).

To reduce these you need to use a more exlusive join and / or use differing data. Try changing the empid and grade and you'll hopefully see what I mean.

Upvotes: 0

Related Questions