Sebastian
Sebastian

Reputation: 86

Select count(*) and "join" over 3 tables

I am looking for a count query within a join of 3 tables that can get me counts on distinct values of one of these tables.

I have 3 tables that I need to join to get the expected data (Workflow, Message and Message_Workflow_Relation).

I want to get the count of workflows grouped by status + one field of the connected Message table in my result (related_name). The related name should be taken from entries where the adapter field equals wf, however there are sometimes more than one Message record that matches this condition, which will result in more datasets in my count then the ones that are really there.

I am pretty sure it must be possible to sort out the , but just don't get it working. Sadly I cannot change the table structure as it's a given schema of a product we use.

My table structure looks like this:

Workflow:

id | workflow_id | starttime | endtime | status
------------------------------------------------------
1  |          22 |         0 |     200 |     OK
2  |          23 |       220 |     920 |  ERROR
3  |          55 |       202 |     588 |     OK

Message_Workflow_Relation:

id | message_id | workflow_id |
-------------------------------
1  |        122 |          22 |
2  |        235 |          22 |
3  |        456 |          22 |
4  |        982 |          22 |
5  |        444 |          23 |
6  |        445 |          23 |
7  |        585 |          55 |
8  |        738 |          55 |
9  |        399 |          55 |

Message:

id | message_id | starttime | endtime | adapter | related_name |
----------------------------------------------------------------
 1 |        122 |         0 |    2335 |      wf |   workflow_1 |
 2 |        235 |       222 |    1000 |   other |        other |
 3 |        456 |       343 |    2330 | another |      another |
 4 |        982 |       222 |    2200 |      wf |   workflow_1 |
 5 |        444 |      2223 |    3333 |      wf |   workflow_2 |
 6 |        445 |      1123 |    1244 |  manual |       manual |
 7 |        585 |      5555 |    5566 |      wf |   workflow_1 |
 8 |        738 |       655 |     999 |      wf |   worfklow_1 |
 9 |        399 |      6655 |    7732 | another |      another |

This should return the following result:

count(*) | related_name | status |
----------------------------------
       2 |   workflow_1 |     OK |
       1 |   workflow_2 |  ERROR |

I am stuck with this following statement, but I am not sure how to make the selection on the adapter = wf unique for each workflow:

select distinct
  count(*),
  m.related_name,
  w.status
from
  workflow as w,
  message as m,
  msg_bpm_rel as rel
where rel.workflow_id = w.workflow_id
  and rel.message_id = m.message_id
  and m.adapter = 'PE'
group by m.related_name,w.status

This returns me (4 workflow_1 instead of 2):

count(*) | related_name | status |
----------------------------------
       4 |   workflow_1 |     OK |
       1 |   workflow_2 |  ERROR |

How can I make a correct query to achieve this?

Any help appreciated.

Upvotes: 6

Views: 42480

Answers (3)

Sukhdeep Singh
Sukhdeep Singh

Reputation: 1

    SELECT distinct theCount
    ,m.related_name
    ,w.status
    FROM workflow as w
    ,message as m
    ,msg_bpm_rel as rel
    ,(SELECT count(1) as theCount
    ,w.workflow_id as wf_id
    FROM workflow as w
    ,message as m
    ,msg_bpm_rel as rel 
    WHERE rel.workflow_id = w.workflow_id 
    AND   rel.message_id = m.message_id 
    AND   m.adapter = 'wf' 
    GROUP BY w.workflow_id) AS t
    WHERE t.wf_id = w.workflow_id 
    AND   rel.workflow_id = w.workflow_id 
    AND   rel.message_id = m.message_id 
    AND   m.adapter = 'wf'

Upvotes: 0

Stephen Perelson
Stephen Perelson

Reputation: 6783

My first attempt at getting the query to work. I don't like having to use the distinct. It makes me think there might still be something wrong:

SELECT distinct theCount
      ,m.related_name
      ,w.status
FROM workflow as w
    ,message as m
    ,msg_bpm_rel as rel
    ,(SELECT count(1) as theCount
            ,w.workflow_id as wf_id
      FROM workflow as w
          ,message as m
          ,msg_bpm_rel as rel 
      WHERE rel.workflow_id = w.workflow_id 
      AND   rel.message_id = m.message_id 
      AND   m.adapter = 'wf' 
      GROUP BY w.workflow_id) AS t

WHERE t.wf_id = w.workflow_id 
AND   rel.workflow_id = w.workflow_id 
AND   rel.message_id = m.message_id 
AND   m.adapter = 'wf'

What is important to note is how the count is performed in this query. You were simply grouping on two columns that happened to be in the SELECT part of the SQL statement. The reality is that to count what you wanted, you have to group by the workflow ID only. This query does that and then feeds that result into another query to display what you wanted.

Upvotes: 0

F.B. ten Kate
F.B. ten Kate

Reputation: 2032

You can do this by grouping and counting a distinct value.

So something like:

select count(distinct w.workflow_id), m.related_name,w.status 
from workflow as w, message as m, msg_bpm_rel as rel 
where rel.workflow_id = w.workflow_id and rel.message_id = m.message_id 
and m.adapter = 'PE' 
group by m.related_name, w.status

This is untested but should work i believe :)

Upvotes: 6

Related Questions