mynameisbear
mynameisbear

Reputation: 23

Selecting multiple fields from row with max value of column, per group

I'm quite certain I've painted myself into a corner and I can't figure my way out.

The Users table and OrderHistories tables both have 1+ million records:

SELECT
    u.Id ,
    u.Email AS EmailAddress ,
    c.Address_Address1 AS "Address 1" ,
    (
      SELECT
            COUNT(*)
        FROM
            dbo.OrderHistories oh
        WHERE
            oh.UserId = u.UserName
    ) AS NumberOfOrders ,
    Carts.SubtotalAmount AS CartTotal ,
    (
      SELECT
            MAX(oh.CreateDate)
        FROM
            dbo.OrderHistories AS oh
        WHERE
            oh.UserId = u.Id
    ) AS LastOrderDate ,
    (
      SELECT
            LastOrders.SubtotalAmount AS LastOrderSubtotal
        FROM
            (
              SELECT
                    UserId ,
                    CreateDate ,
                    SubtotalAmount ,
                    MAX(CreateDate) OVER ( PARTITION BY UserId ) MyLastOrderDate
                FROM
                    Users u
                    INNER JOIN dbo.OrderHistories oh
                    ON u.Id = oh.UserId
            ) AS LastOrders
        WHERE
            LastOrders.MyLastOrderDate = LastOrders.CreateDate
            AND LastOrders.UserId = u.Id
    ) AS LastOrderSubtotal
FROM
    Users u
    INNER JOIN Customers AS c
    ON u.Id = c.Id
    LEFT JOIN dbo.Carts
    ON c.Id = Carts.CustomerId

This particular subquery is my current problem (EXTREMELY inefficient), but I'm not experienced enough to understand exactly why, or how I should be doing it instead (I can't get there from here!):

(
      SELECT
            LastOrders.SubtotalAmount AS LastOrderSubtotal
        FROM
            (
              SELECT
                    UserId ,
                    CreateDate ,
                    SubtotalAmount ,
                    MAX(CreateDate) OVER ( PARTITION BY UserId ) MyLastOrderDate
                FROM
                    Users u
                    INNER JOIN dbo.OrderHistories oh
                    ON u.Id = oh.UserId
            ) AS LastOrders
        WHERE
            LastOrders.MyLastOrderDate = LastOrders.CreateDate
            AND LastOrders.UserId = u.Id
    ) AS LastOrderSubtotal

Anyone mind telling me how terrible I am and then segue right into a suggested improvement?

Upvotes: 1

Views: 43

Answers (1)

SqlZim
SqlZim

Reputation: 38063

Just from looking at your query, you may be able to simplify it using cross apply() like so:

select
    u.Id
  , EmailAddress = u.Email 
  , [Address 1]  = c.Address_Address1
  , CartTotal    = Carts.SubtotalAmount
  , NumberOfOrders = oh.NumberOfOrders
  , LastOrderDate  = oh.CreateDate
  , LastOrderSubtotal = oh.SubtotalAmount
from Users u
  inner join Customers AS c
    on u.Id = c.Id
  left join dbo.Carts
    on c.Id = Carts.CustomerId
  cross apply (
    select top 1
        i.CreateDate
      , i.SubtotalAmount
      , NumberOfOrders = count(*) over (partition by i.UserId) 
   from dbo.OrderHistories i
   where i.UserId = u.Id
   order by i.CreateDate desc
   ) as oh

If you want rows that may not have an OrderHistory, switch to outer apply().

Reference:

Upvotes: 1

Related Questions