Reputation: 1389
I have a SQL Server stored procedure that doesn't give me all the results when I add in the union and the second half. The first half will give me all 6 results, using the union, I only get 5.
There are 3 patients with their own [chart number], each have 2 items that should be displayed. if the [CPTCode] is the same for both entries of a patient, only one of the two entries show up when i add the union (it gives me all 6 with the same [cptcode] without the union). The second half isn't actually pulling any information with what i'm doing right now, but it is needed.
select /*Patients with chart numbers*/
B1.[chart Number],
'0' as newPatient,
isnull(Lytec.[Last Name],'') as [Last Name],
isnull(Lytec.[First Name],'') as [First Name],
isnull(Address.[Name],
Lytec.[Facility Code]) as [Facility],
isnull(B1.DOS,'') as [DOS],
isnull(Ins.[Name],
Lytec.[Primary Code]) as [Primary Code],
isnull(B1.[CPTCode],'') as cptCode,
isnull(B1.[Units],'') as [Units],
isnull(B1.[Modifiers],'') as [Modifiers],
isnull(B1.[cptCodeOther],'') as [cptCodeOther],
isnull(B1.[cptCode2],'') as cptCode2,
isnull(B1.[Units],'') as [Units2],
isnull(B1.[Modifiers2],'') as [Modifiers2],
isnull(B1.[cptCodeOther2],'') as [cptCodeOther2],
'name' as ProviderName
from
[sandboxmr].dbo.patient Lytec
left join [dbo].[Billing] B1 on B1.[Chart Number] = Lytec.[Chart Number]
left join [sandboxmr].dbo.[Address] Address on Lytec.[Facility Code] = Address.[Code]
left join [sandboxmr].dbo.[Insurance] Ins on Lytec.[Primary Code] = Ins.[Code]
where
b1.[userid] = 54
and (b1.[DateSubmitted] >= convert(varchar(25),'2011-8-31',101)
and b1.[DateSubmitted] <= convert(varchar(25),'2011-9-22',101))
union
select /*Patients without chart numbers*/
cast(P.id as varchar(15)) as [chart number],
'1' as newPatient,
isnull(P.[Last Name],'') as [Last Name],
isnull(P.[First Name],'') as [First Name],
isnull(Address.[Name],
P.[Facility Code]) as [Facility],
isnull(IV.DOS,isnull(SV.DOS,'')) as [DOS],
isnull(Ins.[Name],P.[Primary_Code]) as [Primary Code],
isnull(IV.[cptCode],isnull(SV.[cptCode],'')) as cptCode,
isnull(IV.[Units],isnull(SV.[Units],'')) as [Units],
isnull(IV.[Modifiers],isnull(SV.[Modifiers],'')) as [Modifiers],
isnull(IV.[cptcodeother],isnull(SV.[cptcodeother],'')) as [cptCodeOther],
isnull(IV.[cptCode2],isnull(SV.[cptCode2],'')) as cptCode2,
isnull(IV.Units2,isnull(SV.Units2,'')) as [Units2],
isnull(IV.[Modifiers2],isnull(SV.[Modifiers2],'')) as [Modifiers2],
isnull(IV.[cptCodeOther2],isnull(SV.[cptCodeOther2],'')) as [cptCodeOther2],
'Name' as ProviderName
from
[DNSList].[dbo].[Patient] P
left join [dbo].[InitialVisits] IV on p.emr_id = IV.patientid
left join [dbo].[SubsequentVisits] SV on p.emr_id = SV.patientid
left join [sandboxmr].dbo.[Address] Address on P.[Facility Code] = Address.[Code]
left join [sandboxmr].dbo.[Insurance] Ins on P.[Primary_Code] = Ins.[Code]
where
p.[userid] = 54
and p.[Chart Number] is null
and (p.[DateSubmitted] >= convert(varchar(25),'2011-8-31',101)
and p.[DateSubmitted] <= convert(varchar(25),'2011-9-22',101))
order by
[Last Name]
Why does it do this, and how can I fix it? I've tried adding a distinct
to the [cptcode] area, but it of course generates an error.
Thanks for any help you can provide!
Upvotes: 42
Views: 33352
Reputation: 16281
A little background here. Technically a table is a set of rows. This is in the mathematical sense of sets.
Two important properties of mathematical sets are:
The first point is why row order is not reliable unless you add an ORDER BY
clause. The second is one reason you should always have a primary key to ensure that each rows is indeed distinct.
The UNION
operation is a set operation and (a) combines two sets, (b) produces a set. In order to maintain the integrity of a set the UNION
will remove duplicate rows.
Unfortunately, there is no real way of guaranteeing whether two rows are supposed to be same, so SQL will simply compare the values in the SELECT
clauses. If those values are the same, then SQL decides there is a duplicate.
This can lead to an exaggerated example:
SELECT state FROM customers
UNION
SELECT state FROM customers;
In principle, the state
value is produced twice, duplicate values are removed, and what you have is a long-winded way of saying SELECT DISTINCT
.
The set behaviour is also why you can’t have an ORDER BY
clause in the SELECT
statements: once you order a data set, it is no longer a true set.
(Yes, you can have an ORDER BY
clause, but it isn’t attached to a SELECT
statement, but rather to the resulting UNION
).
Of course, SQL isn’t just about pure mathematics, so it includes two features which break the set:
You can order a set with an ORDER BY
clause.
However, you can only do this after the rest of the work is complete, which is why it’s the last clause.
You can include duplicates using the UNION ALL
clause.
Upvotes: 3
Reputation: 11433
I believe you need to use UNION ALL
. UNION
suppresses duplicates by default
Upvotes: 34
Reputation: 135799
UNION
will eliminate duplicate rows, whether they come between the two result sets or, as in your case, within just one result set. Try using UNION ALL
instead.
Upvotes: 107