anon
anon

Reputation: 9

Python: use one sqlite query to find the NOT EXISTS result

I have a dataset of million entries, its comprised of songs and their artists.

I have

a track_id
an artist_id.

There are 3 tables

tracks (track_id, title, artist_id),
artists(artist_id and artist_name) and
artist_term (artist_id and term).

Using only one query, I have to count the number of tracks whose artists don't have any linked terms.

For more reference, the schema of the DB is as follows:

CREATE TABLE tracks (track_id text PRIMARY KEY, title text, release text, year int, duration real, artist_id text);
CREATE TABLE artists (artist_id text, artist_name text);
CREATE TABLE artist_term (artist_id text, term text, FOREIGN KEY(artist_id) 
REFERENCES artists(artist_id));

How do I get to the solution? please help!

Upvotes: 0

Views: 88

Answers (3)

Twinters
Twinters

Reputation: 21

If I'm not mistaken, such a query could be built in a similar fashion like its sibling SQL languages. If so, it should look something like this:

SELECT COUNT(track_id)
FROM tracks as t
WHERE EXISTS (
    SELECT *
    FROM artists AS a
    WHERE a.artist_id = t.artist_id
    AND NOT EXISTS(
        SELECT *
        FROM artist_term as at
        WHERE at.artist_id = a.artist_id
    )
)

So this query basically says: count the number of different tracks (marked by their unique track_id), where there is an artist that has the same artist_id, where no artist_term exists that refers to the artist_id of the artist.

Hope this helps!

Upvotes: 0

GMB
GMB

Reputation: 222572

You can use not exists:

select count(*) cnt
from tracks t
where not exists (select 1 from artist_term at where at.artist_id = t.artist_id)

As far as concerns you do not need to bring in the artists table since artist_id is available in both tracks and artist_term tables.

For performance you want an index on tracks(artist_id) and another one on artist_term(artist_id).

An anti-left join would also get the job done:

select count(*) cnt
from tracks t
left join artist_term at on at.artist_id = t.artist_id
where at.artist_id is null

Upvotes: 1

forpas
forpas

Reputation: 164139

You can join the tables tracks and artists and left join the table artist_term so to find the unmatched artist_ids:

select count(distinct t.track_id)
from tracks t
inner join artists a on a.artist_id = t.artist_id
left join artist_term at on at.artist_id = a.artist_id
where at.artist_id is null

The condition at.artist_id is null in the WHERE clause will return only the unmatched rows which will be counted.

Upvotes: 0

Related Questions