Reputation: 410632
I know the title of this question is a bit confusing, so bear with me. :)
I have a (MySQL) database with a Person
record. A Person
also has a slug field. Unfortunately, slug fields are not unique. There are a number of duplicate records, i.e., the records have different IDs but the same first name, last name, and slug. A Person
may also have 0 or more associated articles, blog entries, and podcast episodes.
If that's confusing, here's a diagram of the structure:
(source: cbstaff.com)
I would like to produce a list of records that match this criteria: duplicate records (i.e., same slug field) for people who also have at least 1 article, blog entry, or podcast episode.
I have a SQL query that will list all records with the same slug fields:
SELECT
id,
first_name,
last_name,
slug,
COUNT(slug) AS person_records
FROM
people_person
GROUP BY
slug
HAVING
(COUNT(slug) > 1)
ORDER BY
last_name, first_name, id;
But this includes records for people that may not have at least 1 article, blog entry, or podcast. Can I tweak this to fit the second criteria?
Edit:
I updated the database diagram to simplify it and make it more clear what I am doing. (Note, some of the DB table names changed -- I was trying to give a higher-level look at the structure before, but it was a bit unclear.)
Upvotes: 4
Views: 644
Reputation: 64635
Select P.id, P.first_name, P.last_name, P.slug
From people_person as P
Join (
Select P1.slug
From people_person As P1
Where Exists (
Select 1
From magazine_author As ma1
Where ma1.person_id = P1.id
Union All
Select 1
From podcast_episode_guests As pod1
Where pod1.person_id = P1.Id
Union All
Select 1
From blogs_blog_authors As b1
Where b1.person_id = P1.Id
)
Group By P1.slug
Having Count(*) > 1
) As dup_slugs
On dup_slugs.slug = P.slug
Order By P.last_name, P.first_name, P.id
Upvotes: 2
Reputation: 485
The other sql statements in the question and other answers are all incorrect, I will try to explain how to avoid the chicken and egg problem using a function (which makes the code much clearer):
SELECT first_name,
last_name,
slug,
COUNT(slug) AS person_records,
SUM(get_count_articles(id)) AS total_articles
FROM people_person
GROUP BY first_name,
last_name,
slug
HAVING COUNT(*) > 1 AND SUM(get_count_articles(id))>=1
ORDER BY last_name, first_name;
With the function (written in Oracle syntax, please excuse my lack of knowledge of mysql functions).
FUNCTION get_count_articles(p_id NUMBER) RETURNS NUMBER IS
l_mag_auth NUMBER;
l_pod_guests NUMBER;
l_blog_auth NUMBER;
BEGIN
SELECT COUNT(*)
INTO l_mag_auth
FROM magazine_author ma1, article a1
WHERE ma1.person_id = p_id;
SELECT COUNT(*)
INTO l_pod_guests
FROM podcast_episode_guests As pod1
WHERE pod1.person_id = p_id;
SELECT COUNT(*)
INTO l_blog_auth
FROM blogs_blog_authors As b1
WHERE b1.person_id = p_id;
RETURN l_mag_auth+l_pod_guests+l_blog_auth;
END;
Note1: The magazine_author should be linked to article as above because the there may not actually be an article.
Note2: I have removed the ID from the original questions select and group by because it will force the wrong answer (as id should be unique in the table no record will be returned EVER). The syntax count(slug) may be confusing the issue here. If the output requires both of the duplicate rows then you MUST re-link to the people_person table to show the list of id's for the slug.
Upvotes: 0
Reputation: 30035
SELECT
id,
first_name,
last_name,
slug,
COUNT(slug) AS person_records,
FROM
people_person
WHERE
id IN (SELECT person_id from podcast_guests GROUP BY person_id) OR
id IN (SELECT person_id from authors GROUP BY person_id) OR
[....]
GROUP BY
slug
HAVING
(COUNT(slug) > 1)
ORDER BY
last_name, first_name, id;
Upvotes: 1
Reputation: 24132
You could perhaps handle it through the having clause:
select Id
, last_name
, first_name
, slug
, COUNT(*) as Person_Records
from Person as p
group by Id
, last_name
, first_name
, slug
having COUNT(slug) > 1
and (
select COUNT(*)
from Author as a
where a.Person_Id = p.Id
) > 1
and (
select COUNT(*)
from Podcast_Guests as pg
where pg.Person_Id = p.Id
) > 1
I omitted the remaining conditions as this is a simple sample.
I hope this helps! =)
Upvotes: 1
Reputation: 53830
You can still include a WHERE clause to filter the results:
SELECT
id,
first_name,
last_name,
slug,
COUNT(slug) AS person_records
FROM
people_person
WHERE id IN (SELECT id FROM article)
GROUP BY
slug
HAVING
(COUNT(slug) > 1)
ORDER BY
last_name, first_name, id;
Upvotes: 1