Reputation: 2085
I am trying to update old SQL query which returns 1020 rows of data; the new query however returns 1864 rows. I have tried both left join/inner join. Thanks.
Old query (1020 rows)
select
cl.clid as clid_, cl.id as hlid_, holdinNo,
holding,ClientID as clientid_, ClientName as clientName_
from
tx_holding as cl
where
cl.status = 1
and cl.roadno = '001'
and cl.id in (select hlid from tx_asset where asset is not null)
and cl.clid in (select id from tbl_client where client_type = 'Non-Govt.')
and cl.id not in (select hlid from tx_bill_pay
where year(date_month) = 2019 and hlid is not null
group by hlid)
New query (1864 rows)
select *
from
(select
cl.clid, cl.id, cl.holdinNo, cl.holding,
cl.ClientID as clientid_, cl.ClientName as clientName_
from
tx_holding as cl
where
cl.status = 1
and cl.roadno = '001') AS cl
inner join
(select clid, hlid
from tx_asset
where asset is not null) as A on Cl.id = A.hlid
inner join
(select cellNo, id
from tbl_client
where client_type = 'Non-Govt.') as C on cl.clid = C.id
where
cl.id not in (select hlid
from tx_bill_pay
where year(date_month) = 2019
and hlid is not null
group by hlid)
Upvotes: 0
Views: 774
Reputation: 224
It is probably in your left joins.
Going from having an instance (in clause) to joining can do this if the join returns multiple records of the same join criteria. I'd start with removing the unused criteria and adding a distinct to the selects. If this solves the issue, I'd then remove the distncts one at a time to see where the duplication is. I'd then change the select from a distinct to a group by. I'd also redo the where clause, a not in a set of ID's is a terrible way to filter.
LEFT JOIN
(
SELECT DISTINCT hlid
FROM tx_asset
WHERE asset IS NOT NULL
) AS A
ON Cl.id = A.hlid
LEFT JOIN
(
SELECT DISTINCT id
FROM tbl_client WHERE client_type = 'Non-Govt.'
) AS C
ON cl.clid=C.id
WHERE cl.id NOT IN
(
SELECT hlid
FROM tx_bill_pay
WHERE
year(date_month) = 2019 AND
hlid IS NOT NULL
GROUP BY hlid
)
Upvotes: 1
Reputation: 31
If I'm not mistaken you need your new iteration of your query to return the exact same results as your old query (both need to be 1020).
However, the structure of your new query is considering tables that were not previously considered as join (inner or left), so you might have a case of data duplicity due to insufficient primary key bindings.
1) In your original query you're only sweeping through "tx_holding" and creating no new joins. Your constraint statements are earching for data in different tables but aren't directly joined to your base table and since they return a single field, the IN statement works fine.
2) In your new query, you are only binding one field from "tx_asset" to your base table, so most likely this table has a compound primary key (2 or more fields as key fields), so you'd get something like this:
TX_HOLDING (CL)
ID
1
2
3
4
TX_ASSET (A)
HLID CLID
1 1
1 2
2 1
2 2
3 1
Since you're only joining the tables on a single field (CL.ID = A.HLID), this creates duplicity because you have 2 records that are equal to 1, but have a different secondary primary key.
One possible solution would be to add another ON statement that binds CL.CLID = A.CLID, if these 2 fields match your table structure of course; also consider you are invoking it in your subquery but not actually using it. This would extend to your sencond join with "tbl_client".
Upvotes: 1