Spag-bol
Spag-bol

Reputation: 35

SQL COUNT returning wrong value with GROUP_CONCAT

I'm currently writing a query to search over three tables (invoice, job_lines and item_records). job_lines and item_records both have multiple entries per invoice.id so I'm trying to combine them into one line per id.

It seems to be working so far but the count(*) is giving back the number of item_records.MAWB/item_records.HAWB per line rather than the overall number of grouped entries.

is there a way to get the actual total number of groups?

SELECT      

        COUNT(*), 
        invoice.id,   
        IFNULL(GROUP_CONCAT(DISTINCT job_lines.job_number SEPARATOR ' '),''),   
        IFNULL(GROUP_CONCAT(DISTINCT item_records.MAWB SEPARATOR ' '),''),   
        IFNULL(GROUP_CONCAT(DISTINCT item_records.HAWB SEPARATOR ' '),'')    
FROM      

        invoice   
LEFT JOIN job_lines ON invoice.id=job_lines.taken   
LEFT JOIN item_records ON invoice.job_number=item_records.job_number  
GROUP BY

        invoice.id   

Example data:

Id Job Number HAWB MAWB
1| 200001 | a1,a2,a3 | b2,b3
2| 200002 | a7, a8, | b1
3| 255544 | a11 | b4

Im after the total number of ID's in this case, but im getting the total number of HAWB/MAWB from each line.

Upvotes: 1

Views: 641

Answers (2)

GarethD
GarethD

Reputation: 69759

You need to perform your aggregates in subqueries, what you are currently doing is causing a cross join, so if there are 2 records in job_lines and 2 records in item_records for a particular invoice id, you end up with 4 rows covering all 4 combinations.

SELECT  jl.Count + ir.Count AS Count,
        i.id,
        IFNULL(jl.JobLines, ''),
        IFNULL(ir.MAWB, '') AS MAWB AS JobLines,
        IFNULL(ir.HAWB, '') AS HAWB
FROM    Invoice AS i
        LEFT JOIN
        (   SELECT  taken, 
                    GROUP_CONCAT(job_number SEPARATOR ' ') AS JobLines,
                    COUNT(*) AS Count
            FROM    job_lines
            GROUP BY taken
        ) AS jl
            ON jl.taken = i.id
        LEFT JOIN
        (   SELECT  job_number, 
                    GROUP_CONCAT(item_records.MAWB SEPARATOR ' ') AS MAWB,
                    GROUP_CONCAT(item_records.HAWB SEPARATOR ' ') AS HAWB,
                    COUNT(*) AS Count
            FROM    item_records
            GROUP BY job_number
        ) AS ir
            ON ir.job_number = i.job_number;

N.B. MySQL Materialises subqueries, i.e. will perform each of the subqueries once when they are first referenced, and store the results in a hashed temporary table and look up from that each time. So if you were to have a WHERE clause that that meant only a limited number of invoices were returned, it would be inefficient to summarise the entire job_lines table into a temporary table, only to use a few records. If this is the case then Thorsten's answer will be more efficient, however if you are querying a large amount of the invoice table, then not having to use DISTINCT to undo the cross join should prove more efficient.

Upvotes: 1

Thorsten Kettner
Thorsten Kettner

Reputation: 94914

You count joined records per invoice.id . So if you have for an invoice id 5 job lines and 3 item records, you get 5 x 3 = 15.

If you want just the number of job lines, use COUNT(DISTINCT job_lines.id).

If you want just the number of item records, use COUNT(DISTINCT item_records.id).

Upvotes: 1

Related Questions