Graeck
Graeck

Reputation: 1356

Find most recent record in a subquery (SQL Server)

I'm converting some code from Oracle to SQL Server (2012) and have run into an issue where this subquery is using a PARTITION/ORDER BY to retrieve the most recent record. The subquery runs fine on its own, but as it is a subquery, I'm getting the error:

SQL Server Database Error: The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.

Here's the section of SQL:

FROM (
  SELECT distinct enr.MemberNum,
    (ISNULL(enr.MemberFirstName, '') + ' ' + ISNULL(enr.MemberLastName, '')) AS MEMBER_NAME,
    enr.MemberBirthDate as DOB,
    enr.MemberGender as Gender,
    LAST_VALUE(enr.MemberCurrentAge) OVER (PARTITION BY MemberNum ORDER BY StaticDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS AGE,
    LAST_VALUE(enr.EligStateAidCategory)OVER (PARTITION BY MemberNum ORDER BY StaticDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS EligStateAidCategory,
    LAST_VALUE(enr.EligStateAidCategory)OVER (PARTITION BY MemberNum ORDER BY StaticDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS AID_CAT_ROLL_UP,
    LAST_VALUE(enr.EligFinanceAidCategoryRollup)OVER (PARTITION BY MemberNum ORDER BY StaticDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS EligFinanceAidCategoryRollup,
    SUM(enr.MemberMonth) OVER (PARTITION BY MemberNum) AS TOTAL_MEMBER_MONTHS
  FROM dv_Enrollment enr
  WHERE enr.StaticDate BETWEEN '01-JUN-2016' AND '30-JUN-2016'
)A

So, I've looked around and found that you can use the TOP (2147483647) hack, so I tried changing the first line to:

SELECT distinct TOP (2147483647) enr.MemberNum,

But I'm still getting the same error. All the other ways I've thought of also require an ORDER BY (using DENSE RANK, etc).

Upvotes: 0

Views: 3083

Answers (2)

SqlZim
SqlZim

Reputation: 38063

Switch this over to an aggregate subquery and cross apply() and see what happens.

select 
    e.MemberNum
  , e.MemberName
  , e.DOB
  , e.Gender
  , x.MemberCurrentAge
  , x.EligStateAidCategory
  , x.EligFinanceAidCategoryRollup
  , x.MemberMonth
  , e.Total_Member_Months
from (
    select
        enr.MemberNum
      , MemberName = isnull(enr.MemberFirstName+' ', '') + isnull(enr.MemberLastName, '')
      , DOB    = enr.MemberBirthDate
      , Gender = enr.MemberGender
      /* This sounds like a weird thing to sum */
      , Total_Member_Months = sum(enr.MemberMonth) 
    from dv_Enrollment enr
    group by 
        enr.MemberNum
      , isnull(enr.MemberFirstName+' ', '') + isnull(enr.MemberLastName, '')
      , enr.MemberBirthDate
      , enr.MemberGender
    ) as e
  /* cross apply() is like an inner join
   , use outer apply() for something like a left join */
  cross apply (
    select top 1
        i.MemberCurrentAge
      , i.EligStateAidCategory
      , i.EligFinanceAidCategoryRollup
      , i.MemberMonth
      from dv_Enrollment as i
      where i.MemberNum = e.MemberNum
        and i.StaticDate >= '20160601'
        and i.StatisDate <= '20160630'
      order by i.StaticDate desc -- descending for most recent
    ) as x

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270713

In both databases, I would write this like:

FROM (SELECT enr.MemberNum,
             (ISNULL(enr.MemberFirstName, '') + ' ' + ISNULL(enr.MemberLastName, '')) AS MEMBER_NAME,
             enr.MemberBirthDate as DOB,
             enr.MemberGender as Gender,
             MAX(CASE WHEN seqnum = 1 THEN enr.MemberCurrentAge END) AS AGE,
             MAX(CASE WHEN seqnum = 1 THEN enr.EligStateAidCategory END) AS EligStateAidCategory,
             MAX(CASE WHEN seqnum = 1 THEN enr.EligStateAidCategory END) AS AID_CAT_ROLL_UP,
             MAX(CASE WHEN seqnum = 1 THEN enr.EligFinanceAidCategoryRollup END) AS EligFinanceAidCategoryRollup,
            SUM(enr.MemberMonth) as TOTAL_MEMBER_MONTHS
    FROM (SELECT enr.*,
                 ROW_NUMBER() OVER (PARTITION BY MemberNum ORDER BY StaticDate DESC) as seqnum
          FROM dv_Enrollment enr
         ) enr
    WHERE enr.StaticDate >= DATE '2016-06-01' AND  -- DATE not needed in SQL Server
          enr.StaticDate < DATE '2016-07-01'       -- DATE not needed in SQL Server
    GROUP BY enr.MemberNum, enr.MemberFirstName, enr.MemberLastName,
             enr.MemberBirthDate, enr.MemberGender
   ) A

Why the changes?

  • The date changes are just to be careful about time components on the date. BETWEEN with date/times is a bad habit, because sometimes it can result in incorrect code and hard to debug errors.
  • I simply do not like using SELECT DISTINCT to mean GROUP BY. It is clever to use it with window functions (and necessary with LAST_VALUE()); but I think the code ends up being misleading.
  • I find the use of the subquery with seqnum to make it clear that the four "last value" variables are all pulling data from the last row.
  • In addition, it the sort is not stable (that is, the key is not unique), seqnum guarantees that the values are all from the same row. last_value() does not.

Upvotes: 2

Related Questions