function
function

Reputation: 1330

Replace Column Values in SQL from a lookup Table

I have a sql Employee Table which describes how much a user likes a particular metal , the table looks like this

    "Employee_number" "Rank_1  "Rank_2   "Rank_3   "Rank_4   "Rank5
                   1    Gold    null      null      null    null
                   2    bronze  Gold      null      null    null
                   3    Gold    platinum  null      null    null
                   4    Gold    copper    null      null    null
                   5    Gold    bronze    platinum  null    null
                   6    Gold    bronze    platinum  null    null
                   7    Gold    platinum  Silver    null    null
                   8    Gold    platinum  Silver    null    null
                   9    Gold    platinum  business  null    null
                   10   null    null      null      null    null
                   11   Silver  bronze    business  platinum Gold

The Employee_number field is a unique field ,

There is also a table which describes the general ranking of the Metals, it looks like this :

Metal     Rank
Gold        1
platinum    2
silver      3
copper      4
bronze      5

What i am trying to do is that whenever there is a null value for an employee , fill in with the default metals based on their ranking

eg -> For Employee 10 : All values are null , easy , His rank_1 metal would be Gold , rank2_metal would be Platinum , rank3_metal would be Silver , rank 4_metal would be copper , rank 5_metal would be bronze

Now for employee_1 , he has a rank_1 metal already , but no other ranks are avialble , so replace rank2_metal with Platinum , rank_3 metal with silver , rank_4 metal with copper , rank_5 metal with bronze

Now for employee_2 , he has bronze as his first metal and Gold as the second metal , his rank3_metal would be Platinum , rank_4 metal would silver , rank5_metal would be copper

similary , lets take the case of employee_6 he has three ranks filled , need to fill in rank 4 and 5 , his rank_4 metals would be silver and rank_5 metal would be copper

Does anyone have any suggestion on how this could be one in sql , i am using bigquery

Upvotes: 1

Views: 1201

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Below is for BigQuery Standard SQL - hope you will adopt this for your real use case.

#standardSQL
WITH metals AS (
  SELECT 'Gold' Metal, 1 RANK UNION ALL SELECT 'platinum', 2 UNION ALL
  SELECT 'silver', 3 UNION ALL SELECT 'copper', 4 UNION ALL SELECT 'bronze', 5 
)
SELECT Employee_number, 
  MAX(IF(pos=0, Metal, NULL)) Rank_1,
  MAX(IF(pos=1, Metal, NULL)) Rank_2,
  MAX(IF(pos=2, Metal, NULL)) Rank_3,
  MAX(IF(pos=3, Metal, NULL)) Rank_4,
  MAX(IF(pos=4, Metal, NULL)) Rank_5
FROM (
  SELECT Employee_number,
    ARRAY_CONCAT(
      ARRAY(SELECT Metal FROM (
          SELECT 1 a, Rank_1 Metal UNION ALL SELECT 2, Rank_2 UNION ALL 
          SELECT 3, Rank_3 UNION ALL SELECT 4, Rank_4 UNION ALL 
          SELECT 5, Rank_5 )
        WHERE NOT Metal IS NULL
        ORDER BY a
      ), ARRAY(SELECT Metal FROM metals m
        WHERE NOT LOWER(Metal) IN (
          SELECT x FROM UNNEST(ARRAY(
            SELECT LOWER(b) FROM (
              SELECT Rank_1 b UNION ALL SELECT Rank_2 UNION ALL
              SELECT Rank_3 UNION ALL SELECT Rank_4 UNION ALL
              SELECT Rank_5 )
            WHERE NOT b IS NULL
          )) x
        ) ORDER BY RANK
      )) arr
  FROM `project.dataset.employee`
), UNNEST(arr) Metal WITH OFFSET pos  
GROUP BY Employee_number
ORDER BY Employee_number    

You can test, play with above using dummy data from your question as below

#standardSQL
WITH `project.dataset.employee` AS (
  SELECT 1 Employee_number, 'Gold' Rank_1, NULL Rank_2, NULL Rank_3, NULL Rank_4, NULL Rank_5 UNION ALL
  SELECT 2, 'bronze', 'Gold', NULL, NULL, NULL UNION ALL
  SELECT 3, 'Gold', 'platinum', NULL, NULL, NULL UNION ALL
  SELECT 4, 'Gold', 'copper', NULL, NULL, NULL UNION ALL
  SELECT 5, 'Gold', 'bronze', 'platinum', NULL, NULL UNION ALL
  SELECT 6, 'Gold', 'bronze', 'platinum', NULL, NULL UNION ALL
  SELECT 7, 'Gold', 'platinum', 'Silver', NULL, NULL UNION ALL
  SELECT 8, 'Gold', 'platinum', 'Silver', NULL, NULL UNION ALL
  SELECT 9, 'Gold', 'platinum', 'business', NULL, NULL UNION ALL
  SELECT 10, NULL, NULL, NULL, NULL, NULL UNION ALL
  SELECT 11, 'Silver', 'bronze', 'business', 'platinum',  'Gold' 
), metals AS (
  SELECT 'Gold' Metal, 1 RANK UNION ALL SELECT 'platinum', 2 UNION ALL
  SELECT 'silver', 3 UNION ALL SELECT 'copper', 4 UNION ALL SELECT 'bronze', 5 
)
SELECT Employee_number, 
  MAX(IF(pos=0, Metal, NULL)) Rank_1,
  MAX(IF(pos=1, Metal, NULL)) Rank_2,
  MAX(IF(pos=2, Metal, NULL)) Rank_3,
  MAX(IF(pos=3, Metal, NULL)) Rank_4,
  MAX(IF(pos=4, Metal, NULL)) Rank_5
FROM (
  SELECT Employee_number,
    ARRAY_CONCAT(
      ARRAY(SELECT Metal FROM (
          SELECT 1 a, Rank_1 Metal UNION ALL SELECT 2, Rank_2 UNION ALL 
          SELECT 3, Rank_3 UNION ALL SELECT 4, Rank_4 UNION ALL 
          SELECT 5, Rank_5 )
        WHERE NOT Metal IS NULL
        ORDER BY a
      ), ARRAY(SELECT Metal FROM metals m
        WHERE NOT LOWER(Metal) IN (
          SELECT x FROM UNNEST(ARRAY(
            SELECT LOWER(b) FROM (
              SELECT Rank_1 b UNION ALL SELECT Rank_2 UNION ALL
              SELECT Rank_3 UNION ALL SELECT Rank_4 UNION ALL
              SELECT Rank_5 )
            WHERE NOT b IS NULL
          )) x
        ) ORDER BY RANK
      )) arr
  FROM `project.dataset.employee`
), UNNEST(arr) Metal WITH OFFSET pos  
GROUP BY Employee_number
ORDER BY Employee_number      

with result

Row Employee_number Rank_1  Rank_2      Rank_3      Rank_4      Rank_5   
1   1               Gold    platinum    silver      copper      bronze   
2   2               bronze  Gold        platinum    silver      copper   
3   3               Gold    platinum    silver      copper      bronze   
4   4               Gold    copper      platinum    silver      bronze   
5   5               Gold    bronze      platinum    silver      copper   
6   6               Gold    bronze      platinum    silver      copper   
7   7               Gold    platinum    Silver      copper      bronze   
8   8               Gold    platinum    Silver      copper      bronze   
9   9               Gold    platinum    business    silver      copper   
10  10              Gold    platinum    silver      copper      bronze   
11  11              Silver  bronze      business    platinum    Gold     

Note: above solution assumes there is no mix between filled and NULL Metal, meaning three options:

1. all Rank fields filled already with Metal  
2. all Rank fields are NULL
3. first 1 or more fields filled with Metal and rest are NULLs 

With this said, first array is built off of filled fields; second array built off of rest of Metal fields from Metals table; then two arrays are concatenated and first 5 elements are used to recreate original table

Hope this is not too messy

P.S. above solution can relatively easy be extended to the case when NULLs and filled Metals are in mix - but looks like this is out of question's scope :o)

Upvotes: 3

Related Questions