A l w a y s S u n n y
A l w a y s S u n n y

Reputation: 38502

PostgreSQL conditional join and result filtering

WHAT I WANT

the name of cast members who have starred in more than one of the same director’s movies (but not directed by themselves) and directed at least one movie themselves.

select * from moviesdirectors LIMIT 5

Top 5 movie directors row

movieid   director
6       Bob Persichetti
6       Peter Ramsey
6       Rodney Rothman
16      Sergio Leone
20      Spike Jonze

AND

select * from moviescast LIMIT 5

Top 5 movie moviescast row

movieid castname
427     Quei Tann
2878    Indrans
3272    Togo Igawa
517     Ajay Naidu
1608    Megan Sousa

WHAT I HAVE TRIED:

select * from moviescast LEFT JOIN moviesdirectors ON moviesdirectors.movieid=moviescast.movieid
where castname in (select director from moviesdirectors) AND castname!=director 
ORDER BY castname

Upvotes: 0

Views: 36

Answers (1)

Ftisiot
Ftisiot

Reputation: 1868

If I understand your question correctly you want to count people on the cast that:

  1. directed one or more movies
  2. have stared in more than one movie for the same director, but the movie shouldn't show them as co-directors.

If that is the case I replicated a similar dataset to yours with

create table moviesdirectors( movieid int, director varchar);
 create table moviescast( movieid int, castname varchar);
 
 insert into moviesdirectors values(6, 'Bob Persichetti');
 insert into moviesdirectors values(6, 'Peter Ramsey');
 insert into moviesdirectors values(6, 'Rodney Rothman');
 insert into moviesdirectors values(10, 'Sergio Leone');
 insert into moviesdirectors values(10, 'Spike Jonze');
 insert into moviesdirectors values(20, 'Spike Jonze');

 insert into moviescast values(6, 'Bob Persichetti');
 insert into moviescast values(20, 'Bob Persichetti');
 insert into moviescast values(10, 'Bob Persichetti');
 insert into moviescast values(6, 'Quei Tann');
 insert into moviescast values(20, 'Quei Tann');
 insert into moviescast values(10, 'Quei Tann');
 insert into moviescast values(6, 'Sergio Leone');
 insert into moviescast values(20, 'Sergio Leone');
 insert into moviescast values(6, 'Peter Ramsey');
 insert into moviescast values(20, 'Peter Ramsey');

The table moviesdirectors now contains the following

 movieid |    director     
---------+-----------------
       6 | Bob Persichetti
       6 | Peter Ramsey
       6 | Rodney Rothman
      10 | Sergio Leone
      10 | Spike Jonze
      20 | Spike Jonze
(6 rows)

and the table moviescast the following

 movieid |    castname     
---------+-----------------
       6 | Bob Persichetti
      20 | Bob Persichetti
      10 | Bob Persichetti
       6 | Quei Tann
      20 | Quei Tann
      10 | Quei Tann
       6 | Sergio Leone
      20 | Sergio Leone
       6 | Peter Ramsey
      20 | Peter Ramsey
(10 rows)

In the above dataset only Bob Persichetti working for Spike Jonze should satisfy the 2 conditions stated initially.

All the rest shouldn't since:

  • Quei Tann never directed
  • Sergio Leone casted in two movies but they were directed by different people
  • Peter Ramsey casted in two movies but one (movieid=6 was directed by himself too)

In order to provide the solution I split the problems in various steps in the following query:

  1. directed_by_themselves: retrieves the list of movieids where the director is also part of the cast
  2. directors: retrieves the distinct list of directors
  3. the last query glues everything up and removes from the resulting dataset all movies where an actor is also part of the directors
with directed_by_themselves as (
  select distinct moviesdirectors.movieid, 
         moviesdirectors.director 
  from moviesdirectors join moviescast 
  on moviesdirectors.movieid = moviescast.movieid 
  and moviesdirectors.director = moviescast.castname
 ),
 directors as (
 select distinct director from moviesdirectors) 
 
 select d.director, c.castname, count(*) nr_movies 
 from moviesdirectors d join moviescast c
 on (d.movieid = c.movieid)
 join directors dir on c.castname = dir.director
 where (d.movieid,c.castname) not in (select movieid,director from directed_by_themselves)
 group by d.director, c.castname
 having count(*) > 1;

the result is

  director   |    castname     | nr_movies 
-------------+-----------------+-----------
 Spike Jonze | Bob Persichetti |         2
(1 row)

Upvotes: 1

Related Questions