Dragon Wolf
Dragon Wolf

Reputation: 61

basic sql: retrieve all data that does not contain specified values

BOOKID AUNAME
  1101 Dilbert
  1101 Emerson
  1101 Sartre
  1102 Axel
  1102 Sartre
  1103 Breese
  1103 Marquez
  1104 Groom
  1104 Young
  1105 Blake
  1105 Julian
  1105 Scott
  1105 Verde
  1105 Young
  1106 Black
  1106 Sartre

BOOKID TITLE
  1101 Java Earth
  1102 MySQL
  1103 Sql Song
  1104 For Sale
  1105 No Sharing
  1106 Pascal
  2007 Index Choice
  2008 Business

Ok, listed above are the two tables I am working with. We are required to make the query to list all bookid's, title's, and author names for all books that DO NOT include the author Sartre (3 total)

Here are two ways I attempted to do just that, both unsuccessfully: (note: query does not return all I listed above [i.e. title] because I need this part to work first)

select LITWORK.BOOKID, LITWORK_AUTHOR.AUNAME
from LITWORK, LITWORK_AUTHOR
where not exists (select BOOKID       
from LITWORK_AUTHOR
where AUNAME = 'Sartre');

RESULT SQL> start q13.sql

no rows selected

Not sure why this executed, but selected no rows confused on that:(.

Second attempt:

select LITWORK.BOOKID, LITWORK_AUTHOR.AUNAME
from LITWORK, LITWORK_AUTHOR
where LITWORK.BOOKID != (select BOOKID 
from LITWORK_AUTHOR
where AUNAME = 'Sartre');

Result SQL> start q13.sql where LITWORK.BOOKID != (select BOOKID * ERROR at line 3: ORA-01427: single-row subquery returns more than one row

So this time, I believe it populated a temp table with those Bookid's Sartre is part of, but returns more than one (since he authored three) and thus not working as I intend.....how to get sql to read and exclude those bookid's that Sartre is a part of one by one.....

how can I query / nested subquery properly to get the table to populate with all those bookids, titles, and author names for which the bookid does NOT include the author Sartre?

This is for a basic SQL class, I am trying to develop professionally, and feel silly getting stuck on this.

Up until now, we have learned SELECT, WHERE, NOT IN, IN, EXISTS, NOT EXISTS, JOIN, UNION, GROUP BY, ORDER BY....pretty much covers what we learned thus far....so the answer should not be outside of these things.

Upvotes: 0

Views: 212

Answers (6)

MT0
MT0

Reputation: 168806

Use an OUTER JOIN so that you get books that have no authors and you can use GROUP BY and HAVING to restrict it to books that do not have any of the authors you do not want.

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE LITWORK_AUTHOR( BOOKID, AUNAME ) AS
SELECT  1101, 'Dilbert' FROM DUAL UNION ALL
SELECT  1101, 'Emerson' FROM DUAL UNION ALL
SELECT  1101, 'Sartre' FROM DUAL UNION ALL
SELECT  1102, 'Axel' FROM DUAL UNION ALL
SELECT  1102, 'Sartre' FROM DUAL UNION ALL
SELECT  1103, 'Breese' FROM DUAL UNION ALL
SELECT  1103, 'Marquez' FROM DUAL UNION ALL
SELECT  1104, 'Groom' FROM DUAL UNION ALL
SELECT  1104, 'Young' FROM DUAL UNION ALL
SELECT  1105, 'Blake' FROM DUAL UNION ALL
SELECT  1105, 'Julian' FROM DUAL UNION ALL
SELECT  1105, 'Scott' FROM DUAL UNION ALL
SELECT  1105, 'Verde' FROM DUAL UNION ALL
SELECT  1105, 'Young' FROM DUAL UNION ALL
SELECT  1106, 'Black' FROM DUAL UNION ALL
SELECT  1106, 'Sartre' FROM DUAL;

CREATE TABLE LITWORK ( BOOKID, TITLE ) AS
SELECT  1101, 'Java Earth' FROM DUAL UNION ALL
SELECT  1102, 'MySQL' FROM DUAL UNION ALL
SELECT  1103, 'Sql Song' FROM DUAL UNION ALL
SELECT  1104, 'For Sale' FROM DUAL UNION ALL
SELECT  1105, 'No Sharing' FROM DUAL UNION ALL
SELECT  1106, 'Pascal' FROM DUAL UNION ALL
SELECT  2007, 'Index Choice' FROM DUAL UNION ALL
SELECT  2008, 'Business' FROM DUAL;

Query 1:

SELECT l.BOOKID,
       MAX( TITLE ) AS TITLE,
       LISTAGG( AUNAME, ',' ) WITHIN GROUP ( ORDER BY AUNAME ) AS Authors
FROM   LITWORK l
       LEFT OUTER JOIN LITWORK_AUTHOR a
       ON ( l.BOOKID = a.BOOKID )
GROUP BY l.BOOKID
HAVING COUNT( CASE AUNAME WHEN 'Sartre' THEN 1 END ) = 0

Results:

| BOOKID |        TITLE |                        AUTHORS |
|--------|--------------|--------------------------------|
|   1103 |     Sql Song |                 Breese,Marquez |
|   1104 |     For Sale |                    Groom,Young |
|   1105 |   No Sharing | Blake,Julian,Scott,Verde,Young |
|   2007 | Index Choice |                         (null) |
|   2008 |     Business |                         (null) |

Upvotes: 0

Thorsten Kettner
Thorsten Kettner

Reputation: 95101

The first thing is to think about how the result should look like. A book has one title and possibly multiple authors. You could now either repeat the same book again and again with one author in each row or show one line for each book with the list of authors as a comma-separated string.

The latter method is an aggregation; you aggregate your data such as to get one book per result row. "per" translates to GROUP BY in SQL. Use HAVING to check whether a book meets your conditions.

select 
  bookid,
  l.title, 
  listagg(la.auname, ', ') within group (order by la.auname) as authors
from litwork l
join litwork_author la using (bookid)
group by bookid
having count(case when la.auname = 'Sartre' then 1 end) = 0
order by l.title;

Upvotes: 0

Niranjan Rajawat
Niranjan Rajawat

Reputation: 563

Here you go:

select a.BOOKID, a.TITLE, b.AUNAME
from LITWORK a
inner join LITWORK_AUTHOR b
   ON A.BOOKID = B.BOOKID
where a.BookID not In 
(select c.BOOKID from LITWORK_AUTHOR c where c.AUNAME = 'Sartre')

Upvotes: 2

xQbert
xQbert

Reputation: 35353

I'm not a fan of , joins in the from. That's the old ansi 87 standard... yes 1987. We're what now 30 years later? yes in IT years , froms should be dead now; but sadly they still exist.

Really all you're missing is the correlation between the exists and your from clause.

SELECT LW.BOOKID, LWA.AUNAME
FROM LITWORK LW
INNER JOIN LITWORK_AUTHOR LWA
 on LW.BookID = LWA.BookID
WHERE not exists (SELECT *       
                  FROM  LITWORK_AUTHOR LWA2
                  WHERE AUNAME = 'Sartre'
                    and LWA.BookID = LWA2.BookID);  <-- Your query is just missing this line

or change it to a not In instead of an exists.

select LITWORK.BOOKID, LITWORK_AUTHOR.AUNAME
from LITWORK, LITWORK_AUTHOR
where LitWork.BookID not In (select BOOKID          <--or change this to a not in instead of exists and add bookID.
                             from LITWORK_AUTHOR
                             where AUNAME = 'Sartre');

I generally try to use not exists over not in as not in's handling of nulls can cause issues; and generally speaking I've seen not exists perform faster (though this is NOT always the case)

My general rule of thumb on tables is only join tables if you need values from them in your select. If you don't, they should be in the where clause as limits on the data you desire.

and while I don't recommend it as the overhead of the derived table and additional join can get expensive vs the not exists, not in.

SELECT LITWORK.BOOKID, LITWORK_AUTHOR.AUNAME
FROM LITWORK LW
INNER JOIN LITWORK_AUTHOR LWA
 on LW.BookID = LWA.BookID
LEFT JOIN (select BookID from LITWORK_AUTHOR LWA2 where AUNAME = 'Sartre') BooksToExclude
 on booksToExclude.BookID = LWA.BookID
WHERE BooksToExclude.BookID is null 

This last one says join the books and authors. Then Include all results and only those results from BoooksToExclude having Satre as an author.

Then the outer where clause says if Satre existed as an author in the BooksToExclude derived table then exclude those books.

The reason your second query didn't work is because your where clause is comparing a scalar result to a non scalar result which simply doesn't work..

1 value can't be compared to many values using the = or != operands.

where LITWORK.BOOKID != (select BOOKID 
from LITWORK_AUTHOR
where AUNAME = 'Sartre');

!= compares 1 value to 1 value much the same as =. if you need to compare 1 value to many values you need to use {Not} IN.

The reason your first query didn't work is simply because they are not correlated and you have a cross join. , means all records from LitWORK tied to all records of LitWork_Author. but you need the two to be tied on Book_ID. This is one reason why JOIN syntax is used as it avoids accidental cross joins when where clauses are omitted.

But if you must keep with the ANSI 87 standard...

select LITWORK.BOOKID, LITWORK_AUTHOR.AUNAME
from LITWORK, LITWORK_AUTHOR
where not exists (select A.BOOKID       
from LITWORK_AUTHOR A
where A.AUNAME = 'Sartre'
  and A.BookID = LitWork_Author.Book_ID)
 and LitWork.BookID = LitWork_Author.BookID

Upvotes: 0

MatBailie
MatBailie

Reputation: 86808

Although the often used NOT IN syntax does work, dependent on the data and the RDBMS, it can be exceedingly expensive and slow. (Especially if the list inside the NOT IN ever becomes large.)

As such you may wish to consider alternatives that have different performance characteristics (often better, but not always, and all vary dependent on the data in your tables)...

SELECT
    *
FROM
    litwork
INNER JOIN
    litwork_author
        ON  litwork_author.BookID = litwork.BookID
INNER JOIN
(
    SELECT
        BookID
    FROM
        litwork_author
    GROUP BY
        BookID
    HAVING
        SUM(CASE WHEN AuName = 'Sartre' THEN 1 ELSE 0 END) = 0  -- Return 1 row per book, where the total number of authors called 'Sartre' is 0
)
    validLitWork
        ON  validLitWork.BookID = litwork.BookID

What would likely perform well in a situation where the number books involving 'Sartre' is relatively high.

In the converse, where the number of books involving 'Sartre' is relatively low, I'd be inclined to go the opposite way...

SELECT
    *
FROM
    litwork
INNER JOIN
    litwork_author
        ON  litwork_author.BookID   = litwork.BookID
LEFT JOIN
    litwork_author   excluded_litwork
        ON  excluded_litwork.BookID = litwork.BookID
        AND excluded_litwork.AuName = 'Sartre'
WHERE
    excluded_litwork.AuName IS NULL

Upvotes: 1

David W
David W

Reputation: 10184

The first query did not return values because of a misapplication of 'not exists'. The intent I'm inferring from the first query is to generate a list of books where the book's ID is not in the list of books authored by Satre:

select LITWORK.BOOKID, LITWORK_AUTHOR.AUNAME
from LITWORK, LITWORK_AUTHOR
where LITWORK.BOOKID in (select BOOKID       
from LITWORK_AUTHOR 
where AUNAME <> 'SARTRE');

The second approach returns the error indicated because your subquery is returning a list of bookid's, but the comparison can only occur against one record at a time - not a list. A single value in a single record can't be compared against a set of values. That's when you go back to the operation of the 'in' operator - asking the question 'Is value x in the following list?'

In reality, however, the better and likely intended approach is to use a join, one in which I would suggest the use of ANSI-standard join syntax rather than the comma-delimited syntax (and in adding the proper modifications to eliminate the issues pointed out by xQBert, this solution now probably does have too many joins, but after many knuckle-raps from DBA's over use of 'in' and 'not exists' (especially), I tend to try the join first, then see if I can improve from there :) :) ).

The illustration here is to create the list of books, joined to the author table, then outer joined to what we could term the "illegal book list" of those authored by Sartre - and also eliminating the issue of a null author value :

Select a.*
  from litwork a
  join litwork_author b
    on a.bookid=b.bookid
  left outer join (select bookid
                     from litwork_author
                    where auname='SARTRE') c
    on a.bookid=c.bookid
  where c.bookid is null

Upvotes: 1

Related Questions