MichalB
MichalB

Reputation: 3341

How to create this SQL query?

How to create this query (exercise included in Stanford Database MOOC):

For all cases where the same reviewer rated the same movie twice and gave it a higher rating the second time, return the reviewer's name and the title of the movie.

The system used is SQLite.

Table movie:

Field Type Null Key Default Extra
mID int(11) YES NULL
title text YES NULL
year int(11) YES NULL
director text YES NULL

Table rating:

Field Type Null Key Default Extra
rID int(11) YES NULL
mID int(11) YES NULL
stars int(11) YES NULL
ratingDate date YES NULL

Table reviewer:

Field Type Null Key Default Extra
rID int(11) YES NULL
name text YES NULL

Expected result:

name title
Sarah Martinez Gone with the Wind

The data.

Upvotes: 0

Views: 1245

Answers (3)

Aakash Mahawar
Aakash Mahawar

Reputation: 51

Try this as well:

select yes1.name,yes1.title from (select yes. name, yes. title,d.stars,d.rating date,(rank() over(partition by yes.name order by d.stars desc)) as rankk1,(rank() over(partition by yes.name order by d.ratingDate asc )) as rankk2
from (select c.mID,a.rID,a.name,c.title from Reviewer as a join Rating as b on a.rID=b.rID
join Movie as c on b.mID=c.mID group by a.name,c.title,c.mID,a.riD having count(*)=2) as yes join Rating as d on yes.mID = d.mID and yes.riD = d.rID) as yes1 where yes1.rankk1 = 1 and yes1.rankk2=2;

I followed the simple analytical approach and got the correct answer.

Upvotes: 0

Roberto
Roberto

Reputation: 3

I've managed to solve this specific exercise with this query:

SELECT R.name, M.title
FROM
    Rating AS RatingLatest
JOIN Rating AS R2 
    ON RatingLatest.rID = R2.rID AND R1.mID = R2.mID
JOIN Reviewer AS R USING (rID)
JOIN Movie AS M USING (mID)
-- Check if there is a newer rating with more stars than the previous one 
WHERE RatingLatest.ratingDate > R2.ratingDate 
AND RatingLatest.stars > R2.stars 

It returns reviewers where it has rated the same movie more than once and when, on the last time (not specifically the second), the rating was greater.

Upvotes: 0

ErikE
ErikE

Reputation: 50251

SELECT
   W.name,
   M.title
FROM
   reviewer AS R
   INNER JOIN movie AS M
      ON EXISTS ( -- there is at least one rating
         SELECT *
         FROM rating AS G
         WHERE
            -- by the reviewer and movie in question
            R.rID = G.rID
            AND M.mID = G.mID
            AND EXISTS ( -- for which another rating exists
               SELECT *
               FROM rating AS G2
               WHERE
                  -- for the same reviewer and movie
                  R.rID = G2.rID
                  AND M.mID = G2.mID
                  AND G.stars < G2.stars -- but rated higher
                  AND G.ratingDate < G2.ratingDate -- and later
            )
      )
;

I am not 100% sure if SQLite allows ON clauses to have EXISTS expressions. If not, you can just move the EXISTS expression to the WHERE clause and perform a cross join between reviewer and movie.

If SQLite doesn't support EXISTS, then put the EXISTS queries as a derived table in the FROM clause, with the two tables INNER JOINed to each other, and then GROUP BY the mId and the rID, then INNER JOIN to the main tables. That might look like this:

SELECT
   R.name,
   M.title
FROM
   (
      SELECT
         G.rID,
         G.mID
      FROM
         rating AS G
         INNER JOIN rating AS G2
            ON G.rID = G2.rID
            AND G.mID = G2.mID
            AND G.stars < G2.stars
            AND G.ratingDate < G2.ratingDate
      GROUP BY
         G.rID,
         G.mID
   ) C
   INNER JOIN reviewer AS R
      ON C.rID = R.rID
   INNER JOIN movie AS M
      ON C.mID = R.mID
;

I hope you can see how these two queries express the same semantics. In a very large database where people have rated the same movies many times, there could be a performance difference (the EXISTS version I first showed could perform better as it can stop as soon as it finds one result).

Note: You could just join the whole mess into a single query and GROUP BY the name, title, rID, mID, but while "simpler", that would be more wrong, as there is no need to duplicate the name and title for many rows, only to throw away that information by grouping. The grouping should happen as early as possible.

Upvotes: 2

Related Questions