Karl
Karl

Reputation: 5822

SQL Server 2008: produce table of unique entries

I have the following problem. I have a table with a few hundred thousand records, which has the following identifiers (for simplicity)

MemberID      SchemeName        BenefitID        BenefitAmount
10            ABC               1                10000
10            ABC               1                2000
10            ABC               2                5000
10            A.B.C             3                11000

What I need to do is to convert this into a single record that looks like this:

MemberID      SchemeName        B1       B2      B3        
10            ABC               12000    5000    11000     

The problem of course being that I need to differentiate by SchemeName, and for most records this won't be a problem, but for some SchemeName wouldn't be captured properly. Now, I don't particularly care if the converted table uses "ABC" or "A.B.C" as scheme name, as long as it just uses 1 of them.

I'd love hear your suggestions.

Thanks
Karl

(Using SQL Server 2008)

Upvotes: 1

Views: 148

Answers (3)

HLGEM
HLGEM

Reputation: 96572

The schemename issue is something that will have to be dealt with manually since the names can be so different. This indicates first and foremost a problem with how you are allowing data entry. You should not have these duplicate schemenames.

However since you do, I think the best thing is to create cross reference table that has two columns, something like recordedscheme and controlling scheme. Select distinct scheme name to create a list of possible schemenames and insert into the first column. Go through the list and determine what the schemename you want to use for each one is (most willbe the same as the schemename). Once you have this done, you can join to this table to get the query. This will work for the current dataset, however, you need to fix whatever is causeing the schemename to get duplicated beofre going further. YOu will also want to fix it so when a schemename is added, you table is populated with the new schemename in both columns. Then if it later turns out that a new one is a duplicate, all you have to do is write a quick update to the second column showing which one it really is and boom you are done.

The alternative is to actually update the schemenames that are bad in the data set to the correct one. Depending on how many records you have to update and in how many tables, this might be a performance issue.This too is only good for querying the data right now and doesn't address how to fix the data going forth.

Upvotes: 0

KM.
KM.

Reputation: 103587

based on the limited info in the original question, give this a try:

DECLARE @YourTable  table(MemberID int, SchemeName varchar(10), BenefitID int, BenefitAmount int)

INSERT INTO @YourTable VALUES (10,'ABC'  ,1,10000)
INSERT INTO @YourTable VALUES (10,'ABC'  ,1,2000)
INSERT INTO @YourTable VALUES (10,'ABC'  ,2,5000)
INSERT INTO @YourTable VALUES (10,'A.B.C',3,11000)
INSERT INTO @YourTable VALUES (11,'ABC'  ,1,10000)
INSERT INTO @YourTable VALUES (11,'ABC'  ,1,2000)
INSERT INTO @YourTable VALUES (11,'ABC'  ,2,5000)
INSERT INTO @YourTable VALUES (11,'A.B.C',3,11000)
INSERT INTO @YourTable VALUES (10,'mnp',3,11000)
INSERT INTO @YourTable VALUES (11,'mnp'  ,1,10000)
INSERT INTO @YourTable VALUES (11,'mnp'  ,1,2000)
INSERT INTO @YourTable VALUES (11,'mnp'  ,2,5000)
INSERT INTO @YourTable VALUES (11,'mnp',3,11000)

SELECT
    MemberID, REPLACE(SchemeName,'.','') AS SchemeName
        ,SUM(CASE WHEN BenefitID=1 THEN BenefitAmount ELSE 0 END) AS B1
        ,SUM(CASE WHEN BenefitID=2 THEN BenefitAmount ELSE 0 END) AS B2
        ,SUM(CASE WHEN BenefitID=3 THEN BenefitAmount ELSE 0 END) AS B3
    FROM @YourTable
    GROUP BY MemberID, REPLACE(SchemeName,'.','')
    ORDER BY MemberID, REPLACE(SchemeName,'.','')

OUTPUT:

MemberID    SchemeName  B1          B2          B3
----------- ----------- ----------- ----------- -----------
10          ABC         12000       5000        11000
10          mnp         0           0           11000
11          ABC         12000       5000        11000
11          mnp         12000       5000        11000

(4 row(s) affected)

Upvotes: 2

Prashant Lakhlani
Prashant Lakhlani

Reputation: 5806

It looks that PIVOTS can help

Upvotes: 0

Related Questions