Reputation: 69
I was extracting some data earlier and found my results too numerous to be of any use (over 4 million rows returned). I found the culprit to be a large handful of IDs that have several thousand records apiece.
Heres the structure of the data. I have two tables. tbl1 has a single record of every 'ATID' (an ATID being a scale id, basically). Each ATID is in essence its own table, though - it has a record for every item on the scale. tbl2 enumerates those items - it has a row for every scale item per ATID. So it has anywhere from 1 to 500,000 rows for each ATID (I know, pretty f'ing excessive).
I'd like to exclude the ATIDs which have over 100 rows. FYI this is a database I have read-only access to.
Here's my original query:
SELECT tbl1.ATID, tbl1.ATDesc, tbl2.AValue, tbl2.ADesc, tbl2.APosNeg
FROM tbl1 LEFT OUTER JOIN tbl2 ON (tb1.ATID = tbl2.ATID);
Here's the query I used to get a list of ATIDs and record counts in tbl2 per ATID:
select ATID, count(*) as row_count
from tbl2
group by ATID
order by row_count desc;
For my end product I'd like to return only ATIDs with fewer than 100 rows in tbl2. However, if I even try to add a WHERE clause to the second query using the row_count variable, it fails. So I have no idea how to even approach combining the two queries and getting the WHERE clause in.
The only alternative I have is to exclude ATIDs specifically when I know they have too many records, but there are too many to exclude (ie, using WHERE NOT ATID = 1016 AND NOT ATID = 554 AND NOT .... etc)
Upvotes: 2
Views: 233
Reputation: 6455
For the second query, the filters related to aggregate functions goes on a HAVING section
select ATID, count(*) as row_count
from tbl2
group by ATID
order by row_count desc;
having count(*) < 100
And for the first query you can simply use a subquery
SELECT tbl1.ATID, tbl1.ATDesc, tbl2.AValue, tbl2.ADesc, tbl2.APosNeg
FROM tbl1
LEFT OUTER JOIN tbl2 ON (tb1.ATID = tbl2.ATID)
WHERE tbl1.ATID in (select ATID
from tbl2
group by ATID
having count(*) < 100)
Upvotes: 1
Reputation: 93724
Here is one way to do it using
Count() Over()
window aggregate will count
the number of records for each ATID
in tbl2
table. Then count the can be used to filter the ATID
in Join
condition
SELECT tbl1.ATID,
tbl1.ATDesc,
tbl2.AValue,
tbl2.ADesc,
tbl2.APosNeg
FROM tbl1
LEFT OUTER JOIN (SELECT Count(1)OVER(partition BY ATID) AS cnt,*
FROM tbl2) tbl2
ON tb1.ATID = tbl2.ATID
AND tbl2.cnt <= 100;
Note : If you dont want all the ATID's
from tabl1
then move the tbl2.cnt <= 100
condition to Where
clause so you will get the count of ATID's
whose count is less than 100
in tbl2
Upvotes: 3
Reputation: 11205
For SQL server, use a CTE (faster than a sub-query). Also, unnecessary brackets in your query
with RowCo as
(
select ATID, count(*) as row_count
from tbl2
group by ATID
having row_count <= 100
)
SELECT tbl1.ATID, tbl1.ATDesc, tbl2.AValue, tbl2.ADesc, tbl2.APosNeg
FROM tbl1
inner join RowCo
on RowCo.ATID = tbl1.ATID
LEFT JOIN tbl2
ON tb1.ATID = tbl2.ATID
Upvotes: 1
Reputation: 31785
You've already done it, you just need to put your two queries together:
SELECT tbl1.ATID, tbl1.ATDesc, tbl2.AValue, tbl2.ADesc, tbl2.APosNeg
FROM tbl1 LEFT OUTER JOIN tbl2 ON (tb1.ATID = tbl2.ATID)
WHERE tbl1.ATID NOT IN (
select ATID
from tbl2
group by ATID
HAVING COUNT(*) >= 100
)
Upvotes: 3