Reputation: 11548
I have a table in this format:
PID TIV_2011 TIV_2012 LAT LON
123 1000 1200 20.123489 80.341245
456 1500 3000 21.341287 80.341245
789 2000 1500 21.341287 80.341245
321 1000 750 21.123641 80.238716
567 1500 2300 22.123641 80.238716
Question: Write a query to print the sum of all total investment values in 2012 (TIV_2012), to a scale of 2 decimal places, for all policyholders who meet the following criteria:
Have the same TIV_2011 value as one or more other policyholders.
Are not located in the same city as another policyholder (i.e.: the latitude, longitude) attribute pair must be unique).
My Query:
select cast(sum(i1.tiv_2012) as decimal(12,2))
from insurance i1 join insurance i2 on i1.tiv_2011 = i2.tiv_2011
where i1.pid != i2.pid
and i1.lat != i2.lat
and i1.lon != i2.lon
Answer is 7250 which is 1200 + 3000 + 750 + 2300.
But the actual answer should be 4250 which is 1200 + 750 + 2300
This question has been answered before and the query that works fine is:
select cast(sum(i1.tiv_2012) as decimal(12,2)) from insurance i1
join (select TIV_2011 from insurance group by TIV_2011 having count(*) > 1) i2 on i1.tiv_2011 = i2.TIV_2011
join (select lat,LON from insurance group by LAT,LON having COUNT(*) = 1) i3 on i1.LAT = i3.LAT and i1.LON = i3.LON
But can someone please let me know where or why my joins are going wrong?
Upvotes: 0
Views: 208
Reputation: 239824
Prefer to write more straightforward SQL that reflects the requirements. If you want to do this via joins it can quickly get out of hand because the joins will cross multiply with each other and inflate the number of rows. In addition, the two numbered clauses require different conditions to express, with the second only doable with a LEFT JOIN
that you're hoping will fail.
I'd prefer to use two EXISTS
checks:
declare @t table (PID int, TIV_2011 int, TIV_2012 int, Lat decimal(8,6),Lon decimal(9,6))
insert into @t(PID,TIV_2011,TIV_2012,LAT,LON) values
(123,1000,1200,20.123489,80.341245),
(456,1500,3000,21.341287,80.341245),
(789,2000,1500,21.341287,80.341245),
(321,1000, 750,21.123641,80.238716),
(567,1500,2300,22.123641,80.238716)
select
CONVERT(decimal(12,2),SUM(TIV_2012))
from
@t t_keep
where
exists (select * from @t t_other2011
where t_keep.PID != t_other2011.PID
and t_keep.TIV_2011 = t_other2011.TIV_2011
) and
not exists (select * from @t t_city
where t_keep.PID != t_city.PID and
t_keep.Lat = t_city.Lat and
t_keep.Lon = t_city.Lon)
Select rows from @t
(t_keep
) where (first EXISTS
) there is another row in the table for another policyholder where we share the same TIV_2011
value. But also (and
) where (second EXISTS
) there isn't a row in the table for another policyholder where we share the same Lat
and Lon
values.
This produces the 4250 value.
Upvotes: 2
Reputation: 95101
This is no task where I would join at all.
select sum(tiv_2012)
from insurance i
where exists
(
select *
from mytable other
where other.pid <> i.pid
and other.tiv_2011 = i.tiv_2011
)
and not exists
(
select *
from mytable other
where other.pid <> i.pid
and other.lat = i.lat
and other.lon = i.lon
);
I know this doesn't answer your question what is exactly wrong with your joins. In my opinion it's the approach to even use joins here, when you only want to look up other rows in the table. For looking up records we have EXISTS
and IN
and criteria goes to the WHERE
clause as it should.
Upvotes: 1
Reputation: 2191
You have two independent conditions but you try to use it together. Split them out using, for instance, a subquery and you will get expected result.
DECLARE @Test TABLE (
PID int,
TIV_2011 int,
TIV_2012 int,
LAT DECIMAL(8,6),
LON DECIMAL(8,6)
)
INSERT @Test(PID, TIV_2011, TIV_2012, LAT, LON)
VALUES
(123, 1000, 1200, 20.123489, 80.341245),
(456, 1500, 3000, 21.341287, 80.341245),
(789, 2000, 1500, 21.341287, 80.341245),
(321, 1000, 750, 21.123641, 80.238716),
(567, 1500, 2300, 22.123641, 80.238716)
select cast(sum(i1.tiv_2012) as decimal(12,2))
from @Test i1
join @Test i2 on i1.tiv_2011 = i2.tiv_2011
AND i1.pid != i2.pid
where
NOT EXISTS (
SELECT 1
FROM @Test
WHERE LAT = i1.lat
and LON = i1.lon
AND PID != i1.pid
)
Upvotes: 1
Reputation: 392
your this line is different and doesn't make sense.
i1.pid != i2.pid
That doesn't need to be there.
Upvotes: 0