Reputation: 9
I have a dataset of million entries, its comprised of songs and their artists.
I have
a track_id
an artist_id.
There are 3 tables
tracks (track_id, title, artist_id),
artists(artist_id and artist_name) and
artist_term (artist_id and term).
Using only one query, I have to count the number of tracks whose artists don't have any linked terms.
For more reference, the schema of the DB is as follows:
CREATE TABLE tracks (track_id text PRIMARY KEY, title text, release text, year int, duration real, artist_id text);
CREATE TABLE artists (artist_id text, artist_name text);
CREATE TABLE artist_term (artist_id text, term text, FOREIGN KEY(artist_id)
REFERENCES artists(artist_id));
How do I get to the solution? please help!
Upvotes: 0
Views: 88
Reputation: 21
If I'm not mistaken, such a query could be built in a similar fashion like its sibling SQL languages. If so, it should look something like this:
SELECT COUNT(track_id)
FROM tracks as t
WHERE EXISTS (
SELECT *
FROM artists AS a
WHERE a.artist_id = t.artist_id
AND NOT EXISTS(
SELECT *
FROM artist_term as at
WHERE at.artist_id = a.artist_id
)
)
So this query basically says: count the number of different tracks (marked by their unique track_id
), where there is an artist that has the same artist_id
, where no artist_term
exists that refers to the artist_id
of the artist.
Hope this helps!
Upvotes: 0
Reputation: 222572
You can use not exists
:
select count(*) cnt
from tracks t
where not exists (select 1 from artist_term at where at.artist_id = t.artist_id)
As far as concerns you do not need to bring in the artists
table since artist_id
is available in both tracks
and artist_term
tables.
For performance you want an index on tracks(artist_id)
and another one on artist_term(artist_id)
.
An anti-left join
would also get the job done:
select count(*) cnt
from tracks t
left join artist_term at on at.artist_id = t.artist_id
where at.artist_id is null
Upvotes: 1
Reputation: 164139
You can join the tables tracks
and artists
and left join the table artist_term
so to find the unmatched artist_id
s:
select count(distinct t.track_id)
from tracks t
inner join artists a on a.artist_id = t.artist_id
left join artist_term at on at.artist_id = a.artist_id
where at.artist_id is null
The condition at.artist_id is null
in the WHERE
clause will return only the unmatched rows which will be counted.
Upvotes: 0