Reputation: 55
I would like to replace repeated fields (hits.customdimensions.value) while the rest of data should remain unchanged. Let's say I want to hash hits.customdimension.value with MD5 where hits.customdimension.index between 1 and 5 and replace the original values with hashed ones
In case of customdimensions.value, it seems to work this:
select x.* except (customdimensions), cd.index as cdindex, MD5(cd.value) as
cdvalue
from `datasetx.tabley` x, unnest(customdimensions) as cd
where cd.index between 1 and 5
Unfortunately if I want to do something similar with hits.customdimensions.value, I have to use except (hits) before unnest(hit.customdimensions) which makes other columns disapear.
select x.* except (hits), hitcd.index as hitcdindex, MD5(hitcd.value) as
hitcdvalue
from `datasetx.tabley` x, unnest(hits) as hit, unnest(hit.customdimensions)
as hitcd
where hitcd.index between 1 and 5
Is there any simple solution to this?
Upvotes: 1
Views: 235
Reputation: 172993
Below fully preserves whole table structure/data and just hashes hits.customdimension.value
for hits.customdimension.index
between 1 and 5
#standardSQL
SELECT *
REPLACE( ARRAY(
SELECT AS STRUCT *
REPLACE( ARRAY(
SELECT AS STRUCT
index,
IF(index BETWEEN 1 AND 5, TO_BASE64(MD5(value)), value) value
FROM UNNEST(customdimensions)
) AS customdimensions
)
FROM UNNEST(hits)
) AS hits
)
FROM `datasetx.tabley`
Upvotes: 2