Reputation: 61
I have a array of string called tags in my hive table. I want to join each element from this array with a ',' delimiter and form a string. However, while joining I don't want to join the string if it doesn't include ':' character in it.
The reason I want to do this is, once string is joined, it should become, a:1,b:2,c:3. Then I want to apply str_to_map to make it a dictionary. Right now my str_to_map is failing if input doesn't include ':' in even of the elements.
Minimum input to fail: ["abc-xyz-1"]
SELECT
CAST(SPLIT(hostname, '-')[1] AS BIGINT) AS host_id,
str_to_map(concat_ws(',', tags)) AS tags,
stack_trace
FROM test_events;
Upvotes: 0
Views: 258
Reputation: 35563
I would suggest "exploding" the array data into separate rows, which allows use of an IF on each element inside the concat_ws. Like this (untested code):
SELECT
CAST(SPLIT(hostname, '-')[1] AS BIGINT) AS host_id,
str_to_map(concat_ws(',',
IF(instr(tag, ':') > 0, tag, NULL)
)) AS tags,
stack_trace
FROM test_events
LATERAL VIEW explode(tags) exploded_tags AS tag
GROUP BY CAST(SPLIT(hostname, '-')[1] AS BIGINT), stack_trace;
refs: lateral view & explode
I have to admit, not entirely sure if the group by is needed, but I have assumed it is.
I suggest experimentation with this snippet:
SELECT hostname, stack_trace, tag
FROM test_events
LATERAL VIEW explode(tags) exploded_tags AS tag
WHERE instr(tag, ':') > 0
Should avoid the tags without the ":", and if you change the where clause you could locate the tags that are non-compliant.
Upvotes: 0