Reputation: 35
I'm currently writing a query to search over three tables (invoice, job_lines and item_records). job_lines and item_records both have multiple entries per invoice.id so I'm trying to combine them into one line per id.
It seems to be working so far but the count(*) is giving back the number of item_records.MAWB/item_records.HAWB per line rather than the overall number of grouped entries.
is there a way to get the actual total number of groups?
SELECT
COUNT(*),
invoice.id,
IFNULL(GROUP_CONCAT(DISTINCT job_lines.job_number SEPARATOR ' '),''),
IFNULL(GROUP_CONCAT(DISTINCT item_records.MAWB SEPARATOR ' '),''),
IFNULL(GROUP_CONCAT(DISTINCT item_records.HAWB SEPARATOR ' '),'')
FROM
invoice
LEFT JOIN job_lines ON invoice.id=job_lines.taken
LEFT JOIN item_records ON invoice.job_number=item_records.job_number
GROUP BY
invoice.id
Example data:
Id Job Number HAWB MAWB
1| 200001 | a1,a2,a3 | b2,b3
2| 200002 | a7, a8, | b1
3| 255544 | a11 | b4
Im after the total number of ID's in this case, but im getting the total number of HAWB/MAWB from each line.
Upvotes: 1
Views: 641
Reputation: 69759
You need to perform your aggregates in subqueries, what you are currently doing is causing a cross join, so if there are 2 records in job_lines
and 2 records in item_records
for a particular invoice id, you end up with 4 rows covering all 4 combinations.
SELECT jl.Count + ir.Count AS Count,
i.id,
IFNULL(jl.JobLines, ''),
IFNULL(ir.MAWB, '') AS MAWB AS JobLines,
IFNULL(ir.HAWB, '') AS HAWB
FROM Invoice AS i
LEFT JOIN
( SELECT taken,
GROUP_CONCAT(job_number SEPARATOR ' ') AS JobLines,
COUNT(*) AS Count
FROM job_lines
GROUP BY taken
) AS jl
ON jl.taken = i.id
LEFT JOIN
( SELECT job_number,
GROUP_CONCAT(item_records.MAWB SEPARATOR ' ') AS MAWB,
GROUP_CONCAT(item_records.HAWB SEPARATOR ' ') AS HAWB,
COUNT(*) AS Count
FROM item_records
GROUP BY job_number
) AS ir
ON ir.job_number = i.job_number;
N.B. MySQL Materialises subqueries, i.e. will perform each of the subqueries once when they are first referenced, and store the results in a hashed temporary table and look up from that each time. So if you were to have a WHERE clause that that meant only a limited number of invoices were returned, it would be inefficient to summarise the entire job_lines
table into a temporary table, only to use a few records. If this is the case then Thorsten's answer will be more efficient, however if you are querying a large amount of the invoice table, then not having to use DISTINCT to undo the cross join should prove more efficient.
Upvotes: 1
Reputation: 94914
You count joined records per invoice.id . So if you have for an invoice id 5 job lines and 3 item records, you get 5 x 3 = 15.
If you want just the number of job lines, use COUNT(DISTINCT job_lines.id)
.
If you want just the number of item records, use COUNT(DISTINCT item_records.id)
.
Upvotes: 1