Reputation: 537
Currently I am working on a project where I need to extract latest data from a table for a report purpose. Below is the sample table structure:-
Every student has several courses and course_id for programming language is +ve and for non programming language -ve. I want to extract latest programming language and non programming language course_id for each and every student.
I use the below SQL query and able to extract the data.
CREATE TABLE COURSE ("STUDENT_ID" int, "COURSE_ID" int, "COURSE_NAME" varchar2(31), "COURSE_START_DATE" timestamp) ; INSERT ALL INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100001, -100, 'C Programming Language', '04-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100001, -200, 'Java Programming Language', '11-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100001, -300, 'C# Programming Language', '07-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100001, 100, 'Data Structure and algorithms', '05-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100001, 200, 'Computer Graphics', '13-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100001, 300, 'Networking', '02-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100002, -300, 'C# Programming Language', '12-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100002, -400, 'Python Programming Language', '07-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100002, -500, 'JavaScript Programming Language', '08-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100002, 100, 'Data Structure and algorithms', '17-Jan-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100002, 300, 'Computer Graphics', '26-Jan-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100002, 400, 'DataBase Management', '10-Jan-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100003, -500, 'JavaScript Programming Language', '07-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100003, -600, 'SQL', '13-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100003, -200, 'Java Programming Language', '17-Jan-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100003, 300, 'Networking', '04-Feb-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100003, 400, 'DataBase Management', '05-Jan-2019 12:00:00 AM') INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE") VALUES (100003, 600, 'Cryptography', '18-Jan-2019 12:00:00 AM') SELECT * FROM dual ; SELECT STUDENT_ID ,COURSE_ID ,COURSE_NAME ,COURSE_START_DATE FROM ( SELECT ROW_NUMBER() OVER(PARTITION BY STUDENT_ID ORDER BY COURSE_START_DATE DESC) AS ROW_NUM ,STUDENT_ID ,COURSE_ID ,COURSE_NAME ,COURSE_START_DATE FROM COURSE WHERE COURSE_ID 0) TEMP1 WHERE TEMP1.ROW_NUM = 1;
But the problem is the real table is very large. There are almost 85k rows and this query is taking sometime. Is there any other better approach. I am using Oracle 11g R2. Please suggest
This is SQLfiddle link http://sqlfiddle.com/#!4/b3fe1/8
Upvotes: 2
Views: 147
Reputation: 29647
Another way to get those results is by using a NOT EXISTS
Because if you want the record with the lastest COURSE_START_DATE
for a student.
Then for that record, there won't exists any with a date that's higher.
(unless there's 2 with the same max date, then it would return both)
SELECT
STUDENT_ID,
COURSE_ID,
COURSE_NAME,
COURSE_START_DATE
FROM COURSE t
WHERE COURSE_ID != 0
AND NOT EXISTS
(
SELECT 1
FROM COURSE d
WHERE d.STUDENT_ID = t.STUDENT_ID
AND d.COURSE_START_DATE > t.COURSE_START_DATE
AND SIGN(d.COURSE_ID) = SIGN(t.COURSE_ID)
AND d.COURSE_ID != 0
)
ORDER BY SIGN(COURSE_ID), STUDENT_ID
Such query would probably benefit from a non-unique index on STUDENT_ID.
A test on db<>fiddle here
By the way, in Oracle 12c you could order by ROW_NUMBER, then fetch only the first with ties.
SELECT
STUDENT_ID,
COURSE_ID,
COURSE_NAME,
COURSE_START_DATE
FROM COURSE t
WHERE COURSE_ID != 0
ORDER BY row_number() over(partition by student_id, SIGN(COURSE_ID) order by course_start_date desc)
FETCH FIRST ROW WITH TIES
Upvotes: 0
Reputation: 167867
You can use your query and add SIGN( course_id )
to the partition:
Query:
SELECT STUDENT_ID
, COURSE_ID
, COURSE_NAME
, COURSE_START_DATE
FROM (
SELECT ROW_NUMBER() OVER (
PARTITION BY STUDENT_ID, SIGN( COURSE_ID )
ORDER BY COURSE_START_DATE DESC
) AS ROW_NUM
, STUDENT_ID
, COURSE_ID
, COURSE_NAME
, COURSE_START_DATE
FROM COURSE
WHERE COURSE_ID != 0
)
WHERE ROW_NUM = 1;
Output:
STUDENT_ID | COURSE_ID | COURSE_NAME | COURSE_START_DATE ---------: | --------: | :------------------------ | :--------------------------- 100001 | -200 | Java Programming Language | 11-FEB-19 12.00.00.000000 AM 100001 | 200 | Computer Graphics | 13-FEB-19 12.00.00.000000 AM 100002 | -300 | C# Programming Language | 12-FEB-19 12.00.00.000000 AM 100002 | 300 | Computer Graphics | 26-JAN-19 12.00.00.000000 AM 100003 | -600 | SQL | 13-FEB-19 12.00.00.000000 AM 100003 | 300 | Networking | 04-FEB-19 12.00.00.000000 AM
db<>fiddle here
Upvotes: 0
Reputation: 11195
Use CTEs with row_number, then union them
with pro as
(
select t1.*, row_number() over(partition by student_id order by course_start_date desc) rn
from course
where course_id > 0 -- programming
)
, nonpro as
(
select t1.*, row_number() over(partition by student_id order by course_start_date desc) rn
from course
where course_id < 0 -- non-programming
)
select *
from pro
where rn = 1
union
select *
from nonpro
where rn = 1
Upvotes: 0
Reputation: 37473
You can try below - you need to add PARTITION BY STUDENT_ID,case when course_id<0 then 1 else 2 end order by COURSE_START_DATE DESC
in over cluase
SELECT STUDENT_ID,COURSE_ID,COURSE_NAME,COURSE_START_DATE
FROM
(
SELECT
ROW_NUMBER() OVER(PARTITION BY STUDENT_ID,case when course_id<0 then 1 else 2 end order by COURSE_START_DATE DESC) AS ROW_NUM
, STUDENT_ID,COURSE_ID,COURSE_NAME,COURSE_START_DATE FROM COURSE
)TEMP1 WHERE TEMP1.ROW_NUM = 1;
Upvotes: 3