Reputation: 3034
I am trying to aggregate data from one table into another. I've inherited this project; I did not design this database nor will I be able to change its format.
The [RawData] table will have 1 record per account, per ChannelCodeID. This table (where I currently have data) has the following fields:
[Account] int
[ChannelCodeID] int
[ChannelCode] varchar(10)
The [AggregatedData] table will have 1 record per account. This table (into which I need to insert data) has the following fields:
[Account] int
[Count] int
[Channel1] int
[Channel2] int
[Channel3] int
[Names] varchar(250)
For example, I might have the following records in my [RawData] table:
Account ChannelCodeID ChannelCode
12345 2 ABC
12345 4 DEF
12345 6 GHI
54321 2 ABC
54321 6 GHI
99999 2 ABC
And, after aggregating them, I would need to produce the following records in my [AggregatedData] table:
Account Count Chanel1 Channel2 Channel3 Names
12345 3 2 4 6 ABC.DEF.GHI
54321 2 2 6 0 ABC.GHI
99999 1 2 0 0 ABC
As you can see, the count is how many records exist in my [RawData] table, Channel1 is the first ChannelCodeID, Channel2 is the second, and Channel3 is the third. If there are not enough ChannelCodeIDs from my [RawData] table, extra Channel columns get a '0' value. Furthermore, I need to concatenate the 'ChannelCode' column and store it in the 'Names' column of the [AggregatedData] table, but (obviously) if there is only one record, I don't want to add the '.'
I can't figure out how to do this without using a cursor and a bunch of variables - but I'm guessing there HAS to be a better way. This doesn't have to be super-fast since it will only run once a month, but it will have to process at least 10-15,000 records each time.
Thanks in advance...
EDIT:
ChannelCodes and ChannelCodeIDs map directly to each other and are always the same. For example, ChannelCodeID 2 is ALWAYS 'ABC'
Also, in the [AggregatedData] table, Channel1 is ALWAYS the lowest value, although this is incidental.
Upvotes: 0
Views: 72
Reputation: 2733
WITH CTE AS (SELECT Account, ChannelCodeID, ChannelCode, RANK() OVER (PARTITION BY Account ORDER BY ChannelCodeID) [ChRank] FROM RawData)
SELECT A.Account, COUNT(Account) [Count], ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=1),0) [Channel1],
ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=2),0) [Channel2],
ISNULL((SELECT TOP 1 ChannelCodeID FROM CTE WHERE A.Account=CTE.Account AND ChRank=3),0) [Channel3],
STUFF((SELECT '.'+ChannelCode FROM CTE WHERE A.Account=CTE.Account FOR XML PATH('')),1,1,'') [Names]
FROM RawData A
GROUP BY A.Account
This uses a Common Table Expression
to group and then display the data.
Upvotes: 0
Reputation: 15075
-- Back up raw data into temp table
select * into #rawData FROM RawData
-- First, populate the lowest channel and base records
INSERT INTO AggregatedData (Account,Count,Channel1,Channel2,Channel3)
SELECT AccountID,1,Min(ChannelCODEID),0,0
FROM #RawData
GROUP BY AccountID
-- Gives you something like this
Account Count Chanel1 Channel2 Channel3 Names
12345 1 2 0 0 NULL
54321 1 2 6 0 NULL
99999 1 2 0 0 NULL
--
DELETE FROM #rawData
WHERE account + str(channelCodeID) in
(SELECT account + str(channelCodeID) FROM AggregatedData)
-- Now do an update
UPDATE AggregatedData SET channel2= xx.NextLowest,count= count+1
FROM
( SELECT AccountID,Min(ChannelCODEID) as NextLowest
FROM #RawData
GROUP BY AccountID ) xx
WHERE AggregatedData.account=xx.accountID
-- Repeat above for Channel3
You then need an update statement against the final aggregated table based on the channel id's. If not run often, I would suggest a UDF which takes 3 parameters and returns a string, some like
UPDATE AggregatedData SET [names] = dbo.BuildNameList(channel1,channel2,channel3)
Will run a bit slow, but still not bad overall
Hope this gives you some ideas
Upvotes: 1
Reputation: 69524
DECLARE @TABLE TABLE (Account INT, ChannelCodeID INT, ChannelCode VARCHAR(10))
INSERT INTO @TABLE VALUES
(12345 ,2 ,'ABC'),
(12345 ,4 ,'DEF'),
(12345 ,6 ,'GHI'),
(54321 ,2 ,'ABC'),
(54321 ,6 ,'GHI'),
(99999 ,2 ,'ABC')
SELECT Account
,[Count]
,ISNULL([Channel1], 0) AS [Channel1]
,ISNULL([Channel2], 0) AS [Channel2]
,ISNULL([Channel3], 0) AS [Channel3]
,Names
FROM
(
SELECT t.Account, T.ChannelCodeID, C.[Count]
,'Channel' + CAST(ROW_NUMBER() OVER
(PARTITION BY t.Account ORDER BY t.ChannelCodeID ASC) AS VARCHAR(10))Channels
,STUFF((SELECT '.' + ChannelCode
FROM @TABLE
WHERE Account = t.Account
FOR XML PATH(''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'') AS Names
FROM @TABLE t INNER JOIN (SELECT Account , COUNT(*) AS [Count]
FROM @TABLE
GROUP BY Account) c
ON T.Account = C.Account
)A
PIVOT (MAX(ChannelCodeID)
FOR Channels
IN ([Channel1],[Channel2],[Channel3])
) p
╔═════════╦═══════╦══════════╦══════════╦══════════╦═════════════╗
║ Account ║ Count ║ Channel1 ║ Channel2 ║ Channel3 ║ Names ║
╠═════════╬═══════╬══════════╬══════════╬══════════╬═════════════╣
║ 12345 ║ 3 ║ 2 ║ 4 ║ 6 ║ ABC.DEF.GHI ║
║ 54321 ║ 2 ║ 2 ║ 6 ║ 0 ║ ABC.GHI ║
║ 99999 ║ 1 ║ 2 ║ 0 ║ 0 ║ ABC ║
╚═════════╩═══════╩══════════╩══════════╩══════════╩═════════════╝
Upvotes: 3