wulftone
wulftone

Reputation: 1708

MySQL: Count related records through a join table, where source can have multiple entries with the same ID

I'm having a hard time wrapping my head around this MySQL query. I've broken it down into its components as much as I can below. I've obfuscated the table names just for kicks (and giggles!) but the structure is the same.

Goal

We want to count the number of puppies that are linked to a kennel

Situation

The way it is, if there are multiple revisions of the same kennel, I get bloated puppy counts... Here's the SQL I'm working with:

SELECT
  kennel.id,
  COUNT(puppy.id) puppy_count
FROM kennel
JOIN relationship
  ON kennel.id = relationship.from_id AND (SELECT MAX(kennel.revision_id))
JOIN puppy
  ON puppy.id = relationship.to_id
JOIN kennel_entry
  ON kennel.id = kennel_entry.review_id
WHERE relationship.from_model = 'kennel'
AND relationship.to_model = 'puppy'
AND puppy.status = 2
AND kennel_entry.status = 3
GROUP BY kennel.id

The above query ALMOST works... but it seems to be reporting some odd numbers in some cases. Sometimes I get double or triple the correct counts. I'm not sure what's going on.

MySQL version: mysql Ver 14.14 Distrib 5.5.37, for debian-linux-gnu (x86_64) using readline 6.2

-- edit --

Here's my latest attempt, it's closer but it doesn't account for kennel_entry.status = 3:

SELECT
  kennel.id,
  COUNT(DISTINCT puppy.id) puppy_count
FROM kennel
JOIN relationship
  ON kennel.id = relationship.from_id
JOIN puppy
  ON puppy.id = relationship.to_id
JOIN kennel_entry
  ON kennel.id = kennel_entry.review_id
WHERE relationship.from_model = 'kennel'
AND relationship.to_model = 'puppy'
AND puppy.status = 2
AND kennel_entry.status = 3
GROUP BY kennel.id

Upvotes: 0

Views: 113

Answers (2)

KingOfAllTrades
KingOfAllTrades

Reputation: 421

Here would be a solution formatted as a common table expression.

WITH LastRev as (
  SELECT id, MAX(k.revision_id) max_rev_id
  FROM kennel k
  GROUP BY k.id)
SELECT
  kennel.id,
  COUNT(puppy.id) puppy_count
FROM kennel
JOIN LastRev lr on kennel.id = lr.id and kennel.revision_id = lr.max_rev_id
JOIN relationship
  ON kennel.id = relationship.from_id
JOIN puppy
  ON puppy.id = relationship.to_id
JOIN kennel_entry
  ON kennel.id = kennel_entry.review_id
WHERE relationship.from_model = 'kennel'
AND relationship.to_model = 'puppy'
AND puppy.status = 2
AND kennel_entry.status = 3
GROUP BY kennel.id

Upvotes: 0

FuzzyTree
FuzzyTree

Reputation: 32402

To only count kennels having the maximum revision_id for every id, add the following join

JOIN (SELECT id, MAX(revision_id) max_revision_id
        FROM kennel GROUP BY id) t1 
  ON t1.id = kennel.id AND t1.max_revision_id = kennel.revision_id

Upvotes: 2

Related Questions