SSRS group based on two columns

Question

I have a report that displays a list of duplicate accounts based on our business rules. This works when one new account is matched with other existing accounts. Where I'm having trouble is when multiple new accounts match the same existing duplicate. Here's an example of how it looks now, grouped by NewId:

NewID   MatchedID   FirstName   LastName    AddDate      Address        PhoneNumber
10      10          Holly       Johnson     4/18/2013    123 1St Rd.    123 456 7890
10      2           Hollie      Johnson     1/1/1990     123 1St Rd.    123 456 7890

11      11          Holley      Johnson     4/17/2013    123 1St Rd.    123-456-7890
11      2           Hollie      Johnson     1/1/1990     123 First Rd.  123 456 7890

50      50          William     Johnson     4/17/2013    999 2nd St.    222 222 2222
50      3           Bill        Jonson      1/2/1990     999 Second St. 222-222-2222

Accounts that have matches are themselves included for comparison.

So, is there a way to group these similar accounts together without duplicates? It should look like this:

GroupID  AcctID   FirstName   LastName    AddDate      Address        PhoneNumber
1        2        Hollie      Johnson     1/1/1990     123 First Rd.  123 456 7890
1        10       Holly       Johnson     4/18/2013    123 1St Rd.    123 456 7890
1        11       Holley      Johnson     4/17/2013    123 1St Rd.    123-456-7890
2        50       William     Johnson     4/17/2013    999 2nd St.    222 222 2222
2        3        Bill        Jonson      1/2/1990     999 Second St. 222-222-2222

I don't care if the grouping is done in SQL or in SSRS. It would need to reference the two ID columns, because the name, address, and phone number may be different. I also need a new GroupID assigned so that they can be grouped in the report.

Ian Preston · Accepted Answer

You can use ranking functions to eliminate rows:

with NoDuplicates as
(
  select *
    , rownum = row_number() over (partition by MatchedID order by NewID)
  from Accounts
)
select   NewID
  , MatchedID
  , Name
  , AddDate
  , Address
  , phoneNumber
from NoDuplicates where rownum = 1

SQL Fiddle with demo.

Although there's no reason you can't just use GROUP BY assuming the address information is always duplicated too:

select NewID = min(NewID)
  , MatchedID
  , Name
  , AddDate
  , Address
  , phoneNumber
from Accounts
group by MatchedID
  , Name
  , AddDate
  , Address
  , phoneNumber

SQL Fiddle with demo.

Both of these are returning your expected result.

Edit after comment:

You can group related rows with a statement like this:

with NoDuplicates as
(
  select *
    , rownum = row_number() over (partition by MatchedID order by NewID)
  from Accounts
  where NewID <> MatchedID
)
select groupID = MatchedID
  , Acct = MatchedID
  , FirstName
  , AddDate
  , Address
  , phoneNumber
from NoDuplicates where rownum = 1
union all
select groupID = coalesce(am.MatchedID, a.NewID)
  , Acct = a.MatchedID
  , a.FirstName
  , a.AddDate
  , a.Address
  , a.phoneNumber
from Accounts a
  -- join to the corresponding matched account
  left join Accounts am on a.MatchedID = am.NewID and am.NewID <> am.MatchedID
where a.NewID = a.MatchedID
order by groupID, Acct

SQL Fiddle with demo.

However, this essentially just groups by MatchedID. If you want numbered groups starting from 1, you can add a DENSE_RANK clause to the statement:

with NoDuplicates as
(
  select *
    , rownum = row_number() over (partition by MatchedID order by NewID)
  from Accounts
  where NewID <> MatchedID
)
, GroupedAcct as
(
  select GroupID = MatchedID
    , Acct = MatchedID
    , FirstName
    , AddDate
    , Address
    , phoneNumber
  from NoDuplicates where rownum = 1
  union all
  select GroupID = coalesce(am.MatchedID, a.NewID)
    , Acct = a.MatchedID
    , a.FirstName
    , a.AddDate
    , a.Address
    , a.phoneNumber
  from Accounts a
    -- join to the corresponding matched account
    left join Accounts am on a.MatchedID = am.NewID and am.NewID <> am.MatchedID
  where a.NewID = a.MatchedID
)
select GroupID = Dense_Rank() over (order by GroupID)
  , Acct
  , FirstName
  , AddDate
  , Address
  , phoneNumber
from GroupedAcct
order by groupID, Acct

SQL Fiddle with demo.

SSRS group based on two columns

Answers (1)

Related Questions