Tone
Tone

Reputation: 783

Eliminating recursive SQL Duplicates

I'm having issues with my script bringing back duplicate names and I'm not sure how to fix it. My script concatenates columns into a row. Basically I have admin names that I need listed in the same row and column of the building they are assigned. The admin names are listed multiple times and when I concatenate the rows it lists the admin name as follows (John Doe | John Doe). Point of Contact and list sometimes contain different names. I need to exclude the duplicate names in the list. I have attached a screen shot of the current and desired results. Using a db2 database but not sure of the version. Your help is appreciated.

       WITH
    /*****************************************************
    *** The cte was used to generate test data easily. ***
    *****************************************************/
    sample_data
( rownum, project_id , project_name ,  name_last , name_first, point_of_contact, building_id, building_name, hours_used,
hours_to_use, percentage_used, capability, bucket_a, bucket_b, bucket_c  ) AS
(
VALUES
  (1, 10 , 'ELITE', 'Gamb', 'Susan', 'Susan Gamb | Amber Jones', 2, 'Main' , 80, 0, 1.0, 6000, 12000, 9600, 5100)
, (2, 10 , 'ELITE', 'Gamb', 'Susan', 'Susan Gamb | Amber Jones', 2 , 'Main', 80,  0, 1.0, 7000, 12500, 8000, 4000)
, (3, 10 , 'ELITE', 'Jones', 'Amber', 'Susan Gamb | Amber Jones', 2, 'Main', 80, 0, 1.0, 6000, 12000, 9600, 5100)
, (4, 10 , 'ELITE', 'Jones', 'Amber', 'Susan Gamb | Amber Jones', 2, 'Main', 80, 0, 1.0, 7000, 12500, 8000, 4000)
, (1, 10 , 'ELITE', 'Gamb', 'Susan', 'Susan Gamb | Amber Jones', 6 ,'Warehouse', 40, 40, .5, 6000, 12500, 9600, 5100)
, (2, 10 , 'ELITE', 'Gamb', 'Susan', 'Susan Gamb | Amber Jones', 6, 'Warehouse' , 40, 40, .5, 7000, 12000, 8000, 4000)
, (3, 10 , 'ELITE', 'Jones', 'Amber',  'Susan Gamb | Amber Jones', 6, 'Warehouse' , 40, 40, .5, 6000, 12500, 9600, 5100)
, (4, 10 , 'ELITE', 'Jones', 'Amber',  'Susan Gamb | Amber Jones', 6, 'Warehouse' , 40, 40, .5, 6000, 12000, 8000, 4000 )
, (1, 1040 , 'ROADRUNNER', 'Sugar', 'Paul', 'Paul Sugar | Rob Brown', 2 , 'Main', 60,  20, .75, 5000, 1000, 1200, 4100 )
, (2, 1040 , 'ROADRUNNER', 'Sugar', 'Paul', 'Paul Sugar | Rob Brown', 2, 'Main',  60, 20, .75, 4000, 1500, 1000, 3000)
, (3, 1040 , 'ROADRUNNER', 'Brown', 'Rob', 'Paul Sugar | Rob Brown', 2, 'Main', 60, 20, .75, 4500, 2000, 1200, 4100)
, (4, 1040 , 'ROADRUNNER', 'Brown', 'Rob', 'Paul Sugar | Rob Brown', 2, 'Main', 60, 20, .75, 4000, 1500, 1000, 3000)
, (1, 1040 , 'ROADRUNNER', 'Sugar', 'Paul', 'Paul Sugar | Rob Brown',  6 , 'Warehouse', 60, 20, .75, 4500, 2000, 1200, 4100)
, (2, 1040 , 'ROADRUNNER', 'Sugar', 'Paul', 'Paul Sugar | Rob Brown',  6, 'Warehouse', 60, 20, .75, 4000, 1500, 1000, 3000)
, (2, 1040 , 'ROADRUNNER',  'Brown', 'Rob', 'Paul Sugar | Rob Brown',  6, 'Warehouse', 60, 20, .75, 4500, 2000, 1200, 4100 )
, (3, 1040 , 'ROADRUNNER',  'Brown', 'Rob', 'Paul Sugar | Rob Brown',  6, 'Warehouse', 60, 20, .75, 4000, 1500, 1000, 3000)

)
    ,

           t2(PROJECT_ID, LIST, POINT_OF_CONTACT, PROJECT_NAME, BUILDING_ID, BUILDING_NAME, HOURS_USED, HOURS_TO_USE, PERCENTAGE_USED, CAPABILITY,
           BUCKET_A, BUCKET_B, BUCKET_C, cnt) AS
        ( SELECT    PROJECT_ID,
                    VARCHAR(NAME_FIRST CONCAT ' ' CONCAT NAME_LAST, 6000),
                    POINT_OF_CONTACT,
                    PROJECT_NAME,
                    BUILDING_ID,
                    BUILDING_NAME,
                    HOURS_USED,
                    HOURS_TO_USE,
                    PERCENTAGE_USED,
                    CAPABILITY,
                    BUCKET_A,
                    BUCKET_B,
                    BUCKET_C,
                    1
        FROM SAMPLE_DATA
                WHERE rowNum = 1
                UNION ALL
        SELECT
            t2.PROJECT_ID,
                    t2.list || ' | ' || SAMPLE_DATA.NAME_FIRST CONCAT ' ' CONCAT SAMPLE_DATA.NAME_LAST,
                    SAMPLE_DATA.POINT_OF_CONTACT,
                    SAMPLE_DATA.PROJECT_NAME,
                    SAMPLE_DATA.BUILDING_ID,
                    SAMPLE_DATA.BUILDING_NAME,
                    SAMPLE_DATA.HOURS_USED,
                    SAMPLE_DATA.HOURS_TO_USE,
                    SAMPLE_DATA.PERCENTAGE_USED,
                    SAMPLE_DATA.CAPABILITY,
                    SAMPLE_DATA.BUCKET_A,
                    SAMPLE_DATA.BUCKET_B,
                    SAMPLE_DATA.BUCKET_C,
                    t2.cnt + 1
        FROM t2, SAMPLE_DATA

                WHERE t2.PROJECT_ID = SAMPLE_DATA.PROJECT_ID
                AND          t2.BUILDING_ID = SAMPLE_DATA.BUILDING_ID
                AND   t2.cnt + 1 = SAMPLE_DATA.rowNum
                 )    
        SELECT
                   PROJECT_ID,
                   PROJECT_NAME,
                   POINT_OF_CONTACT,
                   BUILDING_ID,
                   BUILDING_NAME,
                   HOURS_USED,
                   HOURS_TO_USE,
                   PERCENTAGE_USED,
                   CAPABILITY,
                   BUCKET_A,
                   BUCKET_B,
                   BUCKET_C,
                   list
        FROM t2
                WHERE ( PROJECT_ID, BUILDING_ID, cnt ) IN (
        SELECT PROJECT_ID, BUILDING_ID, MAX(rowNum)
        FROM SAMPLE_DATA
                GROUP BY PROJECT_ID, BUILDING_ID )    
                order by PROJECT_NAME                  

Desired Results: enter image description here

Upvotes: 0

Views: 223

Answers (1)

Vulcronos
Vulcronos

Reputation: 3456

I was able to test your query in MSSQL server and got it working. I did my best to look up the equivalent functions in db2. Even if this doesn't work straight out, I hope it puts you on the right path.

In your second select for t2, I changed:

t2.list || ' | ' || SAMPLE_DATA.NAME_FIRST CONCAT ' ' CONCAT SAMPLE_DATA.NAME_LAST,

to

        WHEN LOCATE(SAMPLE_DATA.NAME_FIRST CONCAT ' ' CONCAT SAMPLE_DATA.NAME_LAST, t2.list) > 0 THEN t2.list
        ELSE t2.list || ' | ' || SAMPLE_DATA.NAME_FIRST || ' ' || SAMPLE_DATA.NAME_LAST
        END,

Basically, we are checking if that person is already in the list, if they are, return the current list, if they aren't, add them to the list.

Upvotes: 1

Related Questions