Ben
Ben

Reputation: 25

MySQL query with multiple counts

I'm a little stuck with a MySQL query and need some help.

Let's assume we have a table "animals" with these columns:

animal_id | animal_species
1         | dog
2         | cat
3         | dog
4         | pig

Furthermore we have a table "examinations" that includes various examinations on these animals:

examination_id | examination_animal_id
1              | 1
2              | 1
3              | 3
4              | 2
5              | 4

Finally we have another table "diagnoses_statistics" that contains one or multiple diagnoses for some examinations:

diagnoses_statistics_id | diagnoses_statistics_examination_id | diagnoses_statistics_diagnosis
1                       | 1                                   | luxation
2                       | 3                                   | thrombopathy
3                       | 3                                   | trauma
4                       | 3                                   | luxation
5                       | 5                                   | trauma

All tables contain many other columns, I just shortened them as above examples should be enough to illustrate my question.

We want to know how many identical diagnoses there are per species. To do this, we could run this query:

SELECT animals.animal_species,
       diagnoses_statistics.diagnoses_statistics_diagnosis,
       COUNT(diagnoses_statistics.diagnoses_statistics_diagnosis) AS diagnoses_count
FROM examinations
JOIN animals ON animals.animal_id = examinations.examination_animal_id
JOIN diagnoses_statistics ON diagnoses_statistics.diagnoses_statistics_examination_id = examinations.examination_id
GROUP BY animals.animal_species,
         diagnoses_statistics.diagnoses_statistics_diagnosis
ORDER BY animals.animal_species,
         diagnoses_statistics.diagnoses_statistics_diagnosis

This gives us something like this:

animal_species | diagnoses_statistics_diagnosis | diagnoses_count
dog            | luxation                       | 2
dog            | thrombopathy                   | 1
dog            | trauma                         | 1
pig            | trauma                         | 1

So far, so good. Now we want to add a column with a sum of every species that contains any diagnosis:

animal_species | diagnoses_statistics_diagnosis | diagnoses_count | species_count
dog            | luxation                       | 2               | 2
dog            | thrombopathy                   | 1               | 2
dog            | trauma                         | 1               | 2
pig            | trauma                         | 1               | 1

To achieve this, I tried this code which seems to produce correct values in "species_count", but it breaks the column "diagnoses_count":

SELECT animals.animal_species,
       diagnoses_statistics.diagnoses_statistics_diagnosis,
       COUNT(diagnoses_statistics.diagnoses_statistics_diagnosis) AS diagnoses_count,
       species_count.species_count
FROM examinations
JOIN animals ON animals.animal_id = examinations.examination_animal_id
JOIN diagnoses_statistics ON diagnoses_statistics.diagnoses_statistics_examination_id = examinations.examination_id
JOIN
    (SELECT animals.animal_species,
            diagnoses_statistics.diagnoses_statistics_diagnosis,
            COUNT(animals.animal_id) AS species_count
     FROM examinations
     JOIN animals ON animals.animal_id = examinations.examination_animal_id
     JOIN diagnoses_statistics ON diagnoses_statistics.diagnoses_statistics_examination_id = examinations.examination_id
     GROUP BY animals.animal_species,
              diagnoses_statistics.diagnoses_statistics_diagnosis
    ) AS species_count ON species_count.animal_species = animals.animal_species
GROUP BY animals.animal_species,
         diagnoses_statistics.diagnoses_statistics_diagnosis
ORDER BY animals.animal_species,
         diagnoses_statistics.diagnoses_statistics_diagnosis

Maybe I am doing this completely wrong.

Can you please help me how to solve this?

Thank you very much for your help in advance.

Upvotes: 1

Views: 47

Answers (2)

Steve Lovell
Steve Lovell

Reputation: 2574

You were pretty close. I think this is what you what. Note the COUNT(DISTINCT ...) in the subquery:

SELECT
    a.animal_species,
    d.diagnoses_statistics_diagnosis,
    COUNT(1) as diagnoses_count,
    spc.species_count
FROM
    animals a
    JOIN examinations e ON
        e.examination_animal_id = a.animal_id
    JOIN diagnoses_statistics d ON
        d.diagnoses_statistics_examination_id = e.examination_id
    JOIN (SELECT 
              a2.animal_species,
              COUNT(DISTINCT a2.animal_id) as species_count
          FROM
              animals a2
              JOIN examinations e2 ON
                  e2.examination_animal_id = a2.animal_id
              JOIN diagnoses_statistics d2 ON
                  d2.diagnoses_statistics_examination_id = e2.examination_id
          GROUP BY
              a2.animal_species         
         ) spc ON
         spc.animal_species = a.animal_species
GROUP BY
    a.animal_species,
    d.diagnoses_statistics_diagnosis,  
    spc.species_count

You can try this out over at rextester.com/XXQ94494. Hat-tip to @P.Salmon for the code building the data set.

Upvotes: 1

P.Salmon
P.Salmon

Reputation: 17665

I don't get how you get 2 in the species count for dog since there are 3 diagnoses for this species. If 3 is correct you could use a sub query.

DROP TABLE IF EXISTS ANIMALS;
CREATE TABLE ANIMALS (animal_id INT, animal_species VARCHAR(3));
INSERT INTO ANIMALS VALUES
(1         , 'dog'),
(2         , 'cat'),
(3         , 'dog'),
(4         , 'pig');

DROP TABLE IF EXISTS EXAMINATIONS;
CREATE TABLE EXAMINATIONS (examination_id INT, examination_animal_id INT);
INSERT INTO EXAMINATIONS VALUES
(1              , 1),
(2              , 1),
(3              , 3),
(4              , 2),
(5              , 4);

DROP TABLE IF EXISTS diagnoses_statistics;
CREATE TABLE diagnoses_statistics (diagnoses_statistics_id INT, diagnoses_statistics_examination_id INT, diagnoses_statistics_diagnosis VARCHAR(20));
INSERT INTO diagnoses_statistics VALUES
(1                       , 1                                   , 'luxation'),
(2                       , 3                                   , 'thrombopathy'),
(3                       , 3                                   , 'trauma'),
(4                       , 3                                   , 'luxation'),
(5                       , 5                                   , 'trauma');

SELECT A.ANIMAL_SPECIES,  D.diagnoses_statistics_diagnosis, COUNT(*) DIAGNOSES_COUNT
    ,(SELECT COUNT(DISTINCT diagnoses_statistics_diagnosis)
    FROM ANIMALS A1
    JOIN EXAMINATIONS E ON E.EXAMINATION_ANIMAL_ID = A1.ANIMAL_ID
    JOIN diagnoses_statistics D ON D.diagnoses_statistics_EXAMINATION_ID = E.EXAMINATION_ID
    WHERE A1.ANIMAL_SPECIES = A.ANIMAL_SPECIES) SPECIES_COUNT
FROM ANIMALS A
JOIN EXAMINATIONS E ON E.EXAMINATION_ANIMAL_ID = A.ANIMAL_ID
JOIN diagnoses_statistics D ON D.diagnoses_statistics_EXAMINATION_ID = E.EXAMINATION_ID
GROUP BY A.ANIMAL_SPECIES,  D.diagnoses_statistics_diagnosis
ORDER BY a.animal_species,
         d.diagnoses_statistics_diagnosis

Result

+----------------+--------------------------------+-----------------+---------------+
| ANIMAL_SPECIES | diagnoses_statistics_diagnosis | DIAGNOSES_COUNT | SPECIES_COUNT |
+----------------+--------------------------------+-----------------+---------------+
| dog            | luxation                       |               2 |             3 |
| dog            | thrombopathy                   |               1 |             3 |
| dog            | trauma                         |               1 |             3 |
| pig            | trauma                         |               1 |             1 |
+----------------+--------------------------------+-----------------+---------------+
4 rows in set (0.00 sec)

Upvotes: 0

Related Questions