Mie
Mie

Reputation: 55

Replace nested/repeated fields

I would like to replace repeated fields (hits.customdimensions.value) while the rest of data should remain unchanged. Let's say I want to hash hits.customdimension.value with MD5 where hits.customdimension.index between 1 and 5 and replace the original values with hashed ones

In case of customdimensions.value, it seems to work this:

select x.* except (customdimensions), cd.index as cdindex, MD5(cd.value) as 
cdvalue
from `datasetx.tabley` x, unnest(customdimensions) as cd
where cd.index between 1 and 5

Unfortunately if I want to do something similar with hits.customdimensions.value, I have to use except (hits) before unnest(hit.customdimensions) which makes other columns disapear.

select x.* except (hits), hitcd.index as hitcdindex, MD5(hitcd.value) as 
hitcdvalue 
from `datasetx.tabley` x, unnest(hits) as hit, unnest(hit.customdimensions) 
as hitcd
where hitcd.index between 1 and 5

Is there any simple solution to this?

Upvotes: 1

Views: 235

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Below fully preserves whole table structure/data and just hashes hits.customdimension.value for hits.customdimension.index between 1 and 5

#standardSQL
SELECT * 
  REPLACE( ARRAY(
    SELECT AS STRUCT * 
      REPLACE( ARRAY(
        SELECT AS STRUCT 
          index, 
          IF(index BETWEEN 1 AND 5, TO_BASE64(MD5(value)), value) value 
        FROM UNNEST(customdimensions)      
      ) AS customdimensions
    )
    FROM UNNEST(hits)
  ) AS hits
)
FROM `datasetx.tabley`

Upvotes: 2

Related Questions