Reputation: 109
The problem i have is like this. I need to create a view that has the following information on it
Order Number, Order Line, Packing Slip Number, Invoice Number
There is more information but the above 4 are the ones that are the problem.
The system as it exists has no connection from packing slip to invoice but both packing slips and invoices are connected to order lines.
So lets say I have this information in the packing slip table
Order1 Line1 PackingSlip1 20
Order1 Line1 PackingSlip2 30
Order1 Line2 PackingSlip3 25
Order1 Line2 PackingSlip4 25
And this information in the invoice table
Order1 Line1 Invoice1 20
Order1 Line1 Invoice2 30
Order1 Line2 Invoice3 25
Order1 Line2 Invoice4 25
There are basically 2 scenarios here. The first one is easy because we can join the PackingSlip table to the InvoiceTable on quantities, order numbers and order lines. The issue is that if we do that the 2nd scenario (where the quantity is the same) would have 4 lines as output instead of just 2.
So my question is. How can I join these 2 tables together so that I match the order number and line. And then find the first matching quantity and join on that row. So I would want PackingSlip3 to match to Invoice3 and PackingSlip4 to match to Invoice4.
As additional information the packing table has a packing date and the invoice table has an invoice date. So I would like to join the earliest packed order,line to the earliest invoice order,line.
Any help would be greatly appreciated. Thank you
Edit. here is some sample code for the time being
Declare @InvoiceTable as Table(
OrderNumber nvarchar(10),
OrderLine int,
InvoiceNumber nvarchar(10),
QuantityInvoiced int
)
Declare @PackingTable as Table(
OrderNumber nvarchar(10),
OrderLine int,
PackingNumber nvarchar(10),
QuantityPacked int
)
Insert into @PackingTable(OrderNumber, OrderLine, PackingNumber, QuantityPacked) Values ('O1', 1, 'P1', 20),('O1', 1, 'P2', 30), ('O1', 2, 'P3', 25), ('O1', 2, 'P4', 25)
Insert into @InvoiceTable(OrderNumber, OrderLine, InvoiceNumber, QuantityInvoiced) Values ('O1', 1, 'I1', 20),('O1', 1, 'I2', 30), ('O1', 2, 'I3', 25), ('O1', 2, 'I4', 25);
with ctePackingSlip as
(
select distinct OrderNumber, OrderLine, PackingNumber, QuantityPacked
from @PackingTable
), cteInvoice as
(
select distinct OrderNumber, OrderLine, InvoiceNumber, QuantityInvoiced
from @InvoiceTable
)
SELECT t0.OrderNumber, t0.OrderLine, PackingNumber, InvoiceNumber, t0.QuantityPacked
FROM ctePackingSlip As t0
JOIN @InvoiceTable As t1
ON t0.OrderNumber = t1.OrderNumber
AND t0.OrderLine = t1.OrderLine
AND t0.QuantityPacked = t1.QuantityInvoiced
The problem is this outputs the following
O1 1 P1 I1 20
O1 1 P2 I2 30
O1 2 P3 I3 25
O1 2 P4 I3 25
O1 2 P3 I4 25
O1 2 P4 I4 25
As you can see at the end there is 2 lines for P3 and P4. I only want 1 line for each
Upvotes: 2
Views: 72
Reputation: 10277
This is very important:
As additional information the packing table has a packing date and the invoice table has an invoice date. So I would like to join the earliest packed order,line to the earliest invoice order,line.
Assign the rows in each group a row number based on the packing/invoice dates, then add that to the join:
with ctePacking as
(
SELECT OrderNumber, OrderLine, PackingSlipNumber, Quantity,
ROW_NUMBER() OVER (PARTITION BY OrderNumber, OrderLine ORDER BY PackingDate ASC) RN
FROM PackingSlip
)
, cteInvoice as
(
SELECT OrderNumber, OrderLine, InvoiceNumber, Quantity,
ROW_NUMBER() OVER (PARTITION BY OrderNumber, OrderLine ORDER BY InvoiceDate ASC) RN
FROM Invoice
)
SELECT P.OrderNumber, P.OrderLine, P.PackingSlipNumber, I.InvoiceNumber
FROM ctePacking P
JOIN cteInvoice I ON P.OrderNumber = I.OrderNumber
AND P.OrderLine = I.OrderLine
AND P.RN = I.RN
Edit - Some explanation:
The idea here is that you know the times of the entries relate to each other (first packing slip entry = first invoice entry), but I assume those times can vary slightly so you can't join directly on them.
To solve this, we can number them by their ordered time, then join on that number.
ROW_NUMBER()
is a window function that will assign a number to every row (as a new column) that we can then join on. The basic syntax is this:
ROW_NUMBER() OVER (PARTITION BY columnsInGroup ORDER BY columnsToOrderBy)
See here for more information on ROW_NUMBER()
.
In your case, we want to order by the datetime, and group by each order number and order line. If orders are always inserted perfectly one after the other, we could maybe remove the partition grouping altogether, but it may be safer to have, so:
ROW_NUMBER() OVER (PARTITION BY OrderNumber, OrderLine ORDER BY OrderDate) RN --RN is an alias for the new column
Now the easiest way to implement adding a row number column for us to reference in the main query is to use a CTE. This lets us create the column, alias it, and already have it available for the final query. An alternative option would be to use derived tables like this:
SELECT P.OrderNumber, P.OrderLine, P.PackingSlipNumber, I.InvoiceNumber
FROM (SELECT OrderNumber, OrderLine, PackingSlipNumber, Quantity,
ROW_NUMBER() OVER (PARTITION BY OrderNumber, OrderLine ORDER BY PackingDate ASC) RN
FROM PackingSlip) P
JOIN (SELECT OrderNumber, OrderLine, InvoiceNumber, Quantity,
ROW_NUMBER() OVER (PARTITION BY OrderNumber, OrderLine ORDER BY InvoiceDate ASC) RN
FROM Invoice) I ON P.OrderNumber = I.OrderNumber
AND P.OrderLine = I.OrderLine
AND P.RN = I.RN
For most people, CTEs are a much cleaner way to separate the logic and leave your final query simpler.
So to sum it up, we are using the CTEs to add row numbers to the base table data, then we are querying the CTEs instead of the base tables to access and use the new RN column in our join.
Upvotes: 2