aruk
aruk

Reputation: 1

Need to make SQL subquery more efficient

I have a table that contains all the pupils.

I need to look through my registered table and find all students and see what their current status is.

If it's reg = y then include this in the search, however student may change from y to n so I need it to be the most recent using start_date to determine the most recent reg status.

The next step is that if n, then don't pass it through. However if latest reg is = y then search the pupil table, using pupilnumber; if that pupil number is in the pupils table then add to count.

Select Count(*) 
From Pupils Partition(Pupils_01) 
Where Pupilnumber in (Select t1.pupilnumber 
                      From registered t1
                      Where T1.Start_Date = (Select Max(T2.Start_Date)
                                             From registered T2
                                             Where T2.Pupilnumber = T1.Pupilnumber)
                        And T1.reg = 'N');

This query works, but it is very slow as there are several records in the pupils table.

Just wondering if there is any way of making it more efficient

Upvotes: 0

Views: 55

Answers (3)

Justin Cave
Justin Cave

Reputation: 231661

Worrying about query performance but not indexing your tables is, well, looking for a kind word here... ummm... daft. That's the whole point of indexes. Any variation on the query is going to be much slower than it needs to be.

I'd guess that using analytic functions would be the most efficient approach since it avoids the need to hit the table twice.

SELECT COUNT(*) 
  FROM( SELECT pupilnumber, 
               startDate,
               reg,
               rank() over (partition by pupilnumber order by startDate desc) rnk
          FROM registered )
 WHERE rnk = 1
   AND reg = 'Y'

Upvotes: 1

DRapp
DRapp

Reputation: 48139

This query MIGHT be more efficient for you and hope at a minimum you have indexes per "pupilnumber" in the respective tables.

To clarify what I am doing, the first inner query is a join between the registered table and the pupil which pre-qualifies that they DO Exist in the pupil table... You can always re-add the "partition" reference if that helps. From that, it is grabbing both the pupil AND their max date so it is not doing a correlated subquery for every student... get all students and their max date first...

THEN, join that result to the registration table... again by the pupil AND the max date being the same and qualify the final registration status as YES. This should give you the count you need.

select
      count(*) as RegisteredPupils
   from
      ( select 
              t2.pupilnumber,
              max( t2.Start_Date ) as MostRecentReg
           from
              registered t2
                 join Pupils p
                    on t2.pupilnumber = p.pupilnumber
           group by
              t2.pupilnumber ) as MaxPerPupil
         JOIN registered t1
            on MaxPerPupil.pupilNumber = t1.pupilNumber
           AND MaxPerPupil.MostRecentRec = t1.Start_Date
           AND t1.Reg = 'Y'

Note: If you have multiple records in the registration table, such as a person taking multiple classes registered on the same date, then you COULD get a false count. If that might be the case, you could change from

COUNT(*)

to

COUNT( DISTINCT T1.PupilNumber )

Upvotes: 0

umut özkan
umut özkan

Reputation: 132

You can look execution plan for this query. It will show you high cost operations. If you see table scan in execution plan you should index them. Also you can try "exists" instead of "in".

Upvotes: 0

Related Questions