Ryan Gadsdon
Ryan Gadsdon

Reputation: 2378

JOIN results give duplicate rows - SQL Server

I am currently dealing with a database that is not set up the way i think it should be. There are barely any primary/foreign keys so joining tables can be quite tricky.

I have tried to recreate a similar issue to the one i have where joins that i create result in incorrect duplicate rows which is a result of the lack of keys.

sql fiddle - http://sqlfiddle.com/#!9/ff59ad/1

Query:

SELECT I.InvoiceNumber, I.AmountPaid,
O.Amount
FROM OrderInvoice O
JOIN Invoice I 
ON O.InvoiceNumber = I.InvoiceNumber

Now as you can see there is duplicate entrys for invoice number

InvoiceNumber   AmountPaid  Amount
    123           10          8
    123           10          2
    567           10          4
    567           10          6

The total amount paid should just be 10 for each whereas the join creates 20 for each because of the matching rows. So if i group these results and sum them the total amount is wrong. Its slightly different from my real life example as it looks like i am actually getting a cross join even with normal join syntax because of the duplicated rows.

Whats the best way to remove these duplicated rows? I have tried joining on multiple columns to create a unique row but i have no other columns to join on.

Can anyone give me advice?

Upvotes: 1

Views: 10348

Answers (6)

Nick.Mc
Nick.Mc

Reputation: 19235

I think that maybe you're after this:

SELECT 
I.InvoiceNumber, 
I.AmountPaid,
L.LineTotal
FROM Invoice O
JOIN 
(
SELECT O.InvoiceNumber, SUM(O.Amount) As LineTotal
FROM OrderInvoice O
GROUP BY O.InvoiceNumber
) As L
ON L.InvoiceNumber = I.InvoiceNumber

This pre-aggregates the lines into a single row per invoice number with a total, then joins it to the header.

This gives you a header level figure.

If you want a line level figure you need to work out what to do with the total - put it on the first line? - put it on the last line? - divide it over the line?

None of these really make any sense though

Upvotes: 0

user359135
user359135

Reputation:

When you join table a to table b you will get one row for in your result for every match.

Io if table b has two rows that match table a then the data from table a will be duplicated. once for the first match and once for the second match.

in your case there are two rows in the OrderInvoice table that match one row in the Invoice table. the AmountPaid is from the Invoice table so it will be duplicated as the single row in Invoice match two rows in OrderInvoice.

Upvotes: 0

Yogesh Sharma
Yogesh Sharma

Reputation: 50173

Just add group by clause with sum aggregation function

SELECT I.InvoiceNumber, I.AmountPaid,
sum(O.Amount) Amount
FROM OrderInvoice O
JOIN Invoice I 
ON O.InvoiceNumber = I.InvoiceNumber group by I.InvoiceNumber, I.AmountPaid

Result :

InvoiceNumber   AmountPaid  Amount
123             10          10
567             10          10

Upvotes: 4

Valerica
Valerica

Reputation: 1630

It is ok to be like this, 'cause you have different amount values

If you SUM the amounts should be ok

SELECT I.InvoiceNumber, I.AmountPaid,
SUM(O.Amount)
FROM OrderInvoice O
JOIN Invoice I 
ON O.InvoiceNumber = I.InvoiceNumber
GROUP BY I.InvoiceNumber,I.AmountPaid

Upvotes: 0

Monofuse
Monofuse

Reputation: 827

If the amount paid is static across all rows then you need to group by that as well, not sum by it.

SELECT I.InvoiceNumber, I.AmountPaid, SUM(O.Amount)
FROM OrderInvoice O
JOIN Invoice I ON O.InvoiceNumber = I.InvoiceNumber
GROUP BY I.InvoiceNumber, I.AmountPaid

Upvotes: 0

gotqn
gotqn

Reputation: 43666

Try this:

SELECT DISTINCT I.InvoiceNumber, I.AmountPaid
FROM OrderInvoice O
JOIN Invoice I 
ON O.InvoiceNumber = I.InvoiceNumber

You have more then one record per InvoiceNumber in the second table - that's why you are getting multiple records in the final set.

Upvotes: 0

Related Questions