Nazmul Hossain
Nazmul Hossain

Reputation: 2085

SQL Query returns different results in two different query

I am trying to update old SQL query which returns 1020 rows of data; the new query however returns 1864 rows. I have tried both left join/inner join. Thanks.

Old query (1020 rows)

select 
    cl.clid as clid_, cl.id as hlid_, holdinNo, 
    holding,ClientID as clientid_, ClientName as clientName_  
from 
    tx_holding as cl 
where 
    cl.status = 1 
    and cl.roadno = '001'
    and cl.id in (select hlid from tx_asset where asset is not null) 
    and cl.clid in (select id from tbl_client where client_type = 'Non-Govt.') 
    and cl.id not in (select hlid from tx_bill_pay 
                      where year(date_month) = 2019 and hlid is not null 
                      group by hlid)

New query (1864 rows)

select * 
from  
    (select 
         cl.clid, cl.id, cl.holdinNo,  cl.holding,
         cl.ClientID as clientid_, cl.ClientName as clientName_  
     from 
         tx_holding as cl 
     where 
         cl.status = 1 
         and cl.roadno = '001') AS cl
inner join
    (select clid, hlid 
     from tx_asset 
     where asset is not null) as A on Cl.id = A.hlid 
inner join 
    (select cellNo, id 
     from tbl_client 
     where client_type = 'Non-Govt.') as C on cl.clid = C.id
where
    cl.id not in (select hlid
                  from tx_bill_pay
                  where year(date_month) = 2019  
                    and hlid is not null 
                  group by hlid) 

Upvotes: 0

Views: 774

Answers (2)

dbbri
dbbri

Reputation: 224

It is probably in your left joins.

Going from having an instance (in clause) to joining can do this if the join returns multiple records of the same join criteria. I'd start with removing the unused criteria and adding a distinct to the selects. If this solves the issue, I'd then remove the distncts one at a time to see where the duplication is. I'd then change the select from a distinct to a group by. I'd also redo the where clause, a not in a set of ID's is a terrible way to filter.

LEFT JOIN
    (
        SELECT DISTINCT hlid
        FROM tx_asset
        WHERE asset IS NOT NULL
    ) AS A
    ON Cl.id = A.hlid

LEFT JOIN
    (
        SELECT DISTINCT id
        FROM tbl_client WHERE client_type = 'Non-Govt.'
    ) AS C
    ON cl.clid=C.id
WHERE cl.id NOT IN
(
    SELECT hlid
    FROM tx_bill_pay
    WHERE
        year(date_month) = 2019 AND
        hlid IS NOT NULL
    GROUP BY hlid
)

Upvotes: 1

VGuy
VGuy

Reputation: 31

If I'm not mistaken you need your new iteration of your query to return the exact same results as your old query (both need to be 1020).

However, the structure of your new query is considering tables that were not previously considered as join (inner or left), so you might have a case of data duplicity due to insufficient primary key bindings.

1) In your original query you're only sweeping through "tx_holding" and creating no new joins. Your constraint statements are earching for data in different tables but aren't directly joined to your base table and since they return a single field, the IN statement works fine.

2) In your new query, you are only binding one field from "tx_asset" to your base table, so most likely this table has a compound primary key (2 or more fields as key fields), so you'd get something like this:

TX_HOLDING (CL)
ID
1
2
3
4

TX_ASSET (A)
HLID CLID
1 1
1 2
2 1
2 2
3 1

Since you're only joining the tables on a single field (CL.ID = A.HLID), this creates duplicity because you have 2 records that are equal to 1, but have a different secondary primary key.

One possible solution would be to add another ON statement that binds CL.CLID = A.CLID, if these 2 fields match your table structure of course; also consider you are invoking it in your subquery but not actually using it. This would extend to your sencond join with "tbl_client".

Upvotes: 1

Related Questions