mikebmassey
mikebmassey

Reputation: 8584

For Loop in SQL Server - is this the right logic?

I'm trying to take transactional data and cleanse it to meet my analysis needs. There are some limitations to how transactions are recorded into the database, and I am trying to get around those limitations.

When a customer places an order with more than 1 product, the transactional database doesn't link the multiple products together. Each product will have a unique sales ID, but there is no way to group multiple sales ID into 1 order. Here is a sample:

OrderID   MultOrderID   CustomerID  SalesDate   SalesTime   ProductID   ProductCost ShippingCost
6082346                 7661X0A     2012-06-12  959         105         99.99       7.95
6082347   5809812YY6Y   T891002     2012-06-12  1005        222         99.95       7.95
6082348   5809812YY6Z   T891002     2012-06-12  1005        273         22.95       1.00
6082349   5809812YY71   T891002     2012-06-12  1005        285         499.95      1.00
6082350   5809812YY72   T891002     2012-06-12  1005        172         49.95       1.00
6082351   5809812YY73   T891002     2012-06-12  1005        105         99.99       7.95
6082352   5809812YY74   X637251     2012-06-12  1010        285         499.95      7.95
6082353   5809812YY75   X637251     2012-06-12  1010        30          1024.99     1.00
6082354                 T512AT0     2012-06-12  1017        172         49.95       7.95

An additional limitation to this transaction system is that it can not ship more than 4 products together. If the customer places an order for 5 products, 4 products are shipped together (and charged 1 shipping charge), the remaining product is shipped separately and charged another shipping charge (yes, the overall business wants to rebuild this entire legacy system....).

What I am trying to determine is the number of products shipped per order, and the aggregate product costs and shipping costs.

If you look at the last 4 characters of the MultOrderID, you'll see that it's sequential, YY6Y becomes YY6Z, then rolls over to YY71, YY72. The logic is standardized - I know that if the CustomerID, SalesDate and SalesTime are the same, then I can pair off the products together. What I don't know is HOW I can accomplish this.

I believe the way to accomplish this is to break out the orders by CustomerID, SalesDate and SalesTime. Then, I get a for-loop, or something like that to cycle through the individual entries. Then, I look for the last 4 characters of the MultOrderID and say - If 1,2 and 3 are the same, and the 4th character is after the 4th character of the previous order, then pair it together, up to 4 orders. If the orderID is the 5th to 8th order in the range, then that's shipment 2, etc.

Can this be done in SQL Server? If not in that, what should I write this in? And is a for-loop what I should be using in this case?

Edit: Here is the output I am trying to get to. Keep in mind that after the 4th product was shipped, I need to restart the ordering (so, 6 products get broken into 2 shipments [4 products and 2 products], 9 products into 3 shipments [4, 4, and 1].

PRODUCTSSHIPPED SALESDATE   SALESTIME   CUSTOMERID  PRODUCTCOST SHIPPINGCOST
4               6/12/12         1005    T891002     672.8       10.95
1               6/12/12         1005    T891002     99.99       7.95
2               6/12/12         1010    X637251     1524.94     8.95
1               6/12/12         1017    T512AT0     49.95       7.95
1               6/12/12         959     7661X0A     99.99       7.95

Upvotes: 0

Views: 1855

Answers (4)

Charleh
Charleh

Reputation: 14012

Well from this statement it seems like you want this:

What I am trying to determine is the number of products shipped per order, and the aggregate product costs and shipping costs.

http://sqlfiddle.com/#!3/e0e71/30

So I'm not sure what you mean by using a foreach loop?

UPDATE:

Got it working using a subquery and ceiling function

Updated the fiddle

FYI SQL is:

SELECT
SalesDate,
SalesTime,
CustomerID,
SUM(ProductCost),
SUM(ShippingCost)
FROM 

(
SELECT
       SalesDate,
       SalesTime,
       CustomerID,
       ProductCost,
       ShippingCost,
       ROW_NUMBER() OVER (PARTITION BY salesdate, salestime, customerid ORDER BY CustomerID) as ProdNumber
FROM Orders
) as Summary
group by SalesDate, SalesTime, CustomerID, ceiling(ProdNumber / 4.0)

I used ROW_NUMBER to get a running count of products for each order, then made this a subquery so I could do the grouping. The grouping just used the number of products divided by 4 (as float) and uses the ceiling function to round up to the nearest int to get it grouping into 4

Upvotes: 2

xQbert
xQbert

Reputation: 35323

I'm not sure why a loop is needed at all..

Select count(*) as ProductsOnOrder, LEFT(CustomerID,4), as CID,
       SalesDate, SalesTime, sum(productCost), sum(ShippingCost)
FROM YOUR_TABLENAME
GROUP BY left(CustomerID,4), salesdate, salestime

What order number do you want to display? Min? Max? All of em? what? same question on products do you want to list the products or just count them?

Select count(*) as ProductsOnOrder, LEFT(CustomerID,4), as CID,
       SalesDate, SalesTime, sum(productCost), sum(ShippingCost),
       min(orderID), Max(orderID)
FROM YOUR_TABLENAME
GROUP BY left(CustomerID,4), salesdate, salestime

Since you know orderID is sequential for each line on the order you could return min/max and subtract the two as well to get a count.

Upvotes: 0

YavgenyP
YavgenyP

Reputation: 2123

I dont think you need a loop here. Usually its considered as a bad practice in sql, unless completly unavoidable. Can you assume that if an order is made by a user at the same datetime exactly it belongs to the same logical order (order group)? Anyway, the whole problem can probably be solved using SQL server's partition and over clauses. Look at sample D there, i think its doing something close to what you need.

EDIT
The range clause is availibale only in sql 2012, however you can still use partioning and rownumber, and then group by ur results by using simple calculation (ROWNUMBER / 4) on the returned rownumber

Upvotes: 1

Russell Fox
Russell Fox

Reputation: 5425

This should give you the number of orders for that customer/date/time in the NumOrders field. It uses my new favorite function, Row_Number:

SELECT [CUSTOMERID], [SALESDATE], [SALESTIME], MAX(NumOrders)
FROM (
    SELECT [CUSTOMERID], 
    [SALESDATE],
    [SALESTIME],
    ROW_NUMBER() OVER(PARTITION BY [CUSTOMERID], [SALESDATE], [SALESTIME] ORDER BY [CUSTOMERID]) AS NumOrders
) t1
GROUP BY [CUSTOMERID], [SALESDATE], [SALESTIME]

Upvotes: 1

Related Questions