SwapnaSubham Das
SwapnaSubham Das

Reputation: 537

how to retrieve latest data from a table

Currently I am working on a project where I need to extract latest data from a table for a report purpose. Below is the sample table structure:-

enter image description here Every student has several courses and course_id for programming language is +ve and for non programming language -ve. I want to extract latest programming language and non programming language course_id for each and every student.

I use the below SQL query and able to extract the data.

CREATE TABLE COURSE
    ("STUDENT_ID" int, "COURSE_ID" int, "COURSE_NAME" varchar2(31), "COURSE_START_DATE" timestamp)
;

INSERT ALL 
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100001, -100, 'C Programming Language', '04-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100001, -200, 'Java Programming Language', '11-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100001, -300, 'C# Programming Language', '07-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100001, 100, 'Data Structure and algorithms', '05-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100001, 200, 'Computer Graphics', '13-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100001, 300, 'Networking', '02-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100002, -300, 'C# Programming Language', '12-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100002, -400, 'Python Programming Language', '07-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100002, -500, 'JavaScript Programming Language', '08-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100002, 100, 'Data Structure and algorithms', '17-Jan-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100002, 300, 'Computer Graphics', '26-Jan-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100002, 400, 'DataBase Management', '10-Jan-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100003, -500, 'JavaScript Programming Language', '07-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100003, -600, 'SQL', '13-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100003, -200, 'Java Programming Language', '17-Jan-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100003, 300, 'Networking', '04-Feb-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100003, 400, 'DataBase Management', '05-Jan-2019 12:00:00 AM')
    INTO COURSE ("STUDENT_ID", "COURSE_ID", "COURSE_NAME", "COURSE_START_DATE")
         VALUES (100003, 600, 'Cryptography', '18-Jan-2019 12:00:00 AM')
SELECT * FROM dual
;

SELECT STUDENT_ID
,COURSE_ID
,COURSE_NAME
,COURSE_START_DATE
  FROM (
SELECT 
ROW_NUMBER() OVER(PARTITION BY STUDENT_ID ORDER BY COURSE_START_DATE DESC) AS ROW_NUM
,STUDENT_ID
,COURSE_ID
,COURSE_NAME
,COURSE_START_DATE
  FROM
COURSE
WHERE COURSE_ID  0) TEMP1 WHERE TEMP1.ROW_NUM = 1;

enter image description here

But the problem is the real table is very large. There are almost 85k rows and this query is taking sometime. Is there any other better approach. I am using Oracle 11g R2. Please suggest

This is SQLfiddle link http://sqlfiddle.com/#!4/b3fe1/8

Upvotes: 2

Views: 147

Answers (4)

LukStorms
LukStorms

Reputation: 29647

Another way to get those results is by using a NOT EXISTS

Because if you want the record with the lastest COURSE_START_DATE for a student.
Then for that record, there won't exists any with a date that's higher.
(unless there's 2 with the same max date, then it would return both)

SELECT 
 STUDENT_ID, 
 COURSE_ID, 
 COURSE_NAME, 
 COURSE_START_DATE
FROM COURSE t
WHERE COURSE_ID != 0
AND NOT EXISTS
(
  SELECT 1
  FROM COURSE d
  WHERE d.STUDENT_ID = t.STUDENT_ID
    AND d.COURSE_START_DATE > t.COURSE_START_DATE
    AND SIGN(d.COURSE_ID) = SIGN(t.COURSE_ID)
    AND d.COURSE_ID != 0
)
ORDER BY SIGN(COURSE_ID), STUDENT_ID

Such query would probably benefit from a non-unique index on STUDENT_ID.

A test on db<>fiddle here

By the way, in Oracle 12c you could order by ROW_NUMBER, then fetch only the first with ties.

SELECT 
 STUDENT_ID, 
 COURSE_ID, 
 COURSE_NAME, 
 COURSE_START_DATE
FROM COURSE t
WHERE COURSE_ID != 0
ORDER BY row_number() over(partition by student_id, SIGN(COURSE_ID) order by course_start_date desc)
FETCH FIRST ROW WITH TIES

Upvotes: 0

MT0
MT0

Reputation: 167867

You can use your query and add SIGN( course_id ) to the partition:

Query:

SELECT STUDENT_ID
     , COURSE_ID
     , COURSE_NAME
     , COURSE_START_DATE
FROM (
  SELECT ROW_NUMBER() OVER (
           PARTITION BY STUDENT_ID, SIGN( COURSE_ID )
           ORDER BY COURSE_START_DATE DESC
         ) AS ROW_NUM
       , STUDENT_ID
       , COURSE_ID
       , COURSE_NAME
       , COURSE_START_DATE
  FROM   COURSE
  WHERE COURSE_ID != 0
)
WHERE  ROW_NUM = 1;

Output:

STUDENT_ID | COURSE_ID | COURSE_NAME               | COURSE_START_DATE           
---------: | --------: | :------------------------ | :---------------------------
    100001 |      -200 | Java Programming Language | 11-FEB-19 12.00.00.000000 AM
    100001 |       200 | Computer Graphics         | 13-FEB-19 12.00.00.000000 AM
    100002 |      -300 | C# Programming Language   | 12-FEB-19 12.00.00.000000 AM
    100002 |       300 | Computer Graphics         | 26-JAN-19 12.00.00.000000 AM
    100003 |      -600 | SQL                       | 13-FEB-19 12.00.00.000000 AM
    100003 |       300 | Networking                | 04-FEB-19 12.00.00.000000 AM

db<>fiddle here

Upvotes: 0

JohnHC
JohnHC

Reputation: 11195

Use CTEs with row_number, then union them

with pro as 
(
select t1.*, row_number() over(partition by student_id order by course_start_date desc) rn
from course 
where course_id > 0 -- programming
)
, nonpro as 
(
select t1.*, row_number() over(partition by student_id order by course_start_date desc) rn
from course 
where course_id < 0 -- non-programming
)
select *
from pro
where rn = 1
union
select *
from nonpro
where rn = 1

Upvotes: 0

Fahmi
Fahmi

Reputation: 37473

You can try below - you need to add PARTITION BY STUDENT_ID,case when course_id<0 then 1 else 2 end order by COURSE_START_DATE DESC in over cluase

SELECT STUDENT_ID,COURSE_ID,COURSE_NAME,COURSE_START_DATE
FROM 
(
  SELECT 
  ROW_NUMBER() OVER(PARTITION BY STUDENT_ID,case when course_id<0 then 1 else 2 end order by COURSE_START_DATE DESC) AS ROW_NUM
, STUDENT_ID,COURSE_ID,COURSE_NAME,COURSE_START_DATE FROM COURSE
 )TEMP1 WHERE TEMP1.ROW_NUM = 1;

Upvotes: 3

Related Questions