Reputation: 34703
Pretty straightforward. I have an array-like column encoded as a string (varchar) and want to cast it to array (so I can then explode it and manipulate the elements in "long" format).
The two most natural approaches don't seem to work:
-- just returns a length-1 array with a single string element '[1, 2, 3]'
select array('[1, 2, 3]')
-- errors: DataType array is not supported.
select cast('[1, 2, 3]' as array)
The ugly/inelegant/circuitous way to get what I want is:
select explode(split(replace(replace('[1, 2, 3]', '['), ']'), ', '))
-- '1'
-- '2'
-- '3'
(regexp_replace could subsume the two replace calls, but regexes with square brackets are always a pain; ltrim and rtrim, or trim(BOTH '[]'...), could also be used.)
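For reference, a minimal sketch of the trim-based variant mentioned above, run from PySpark; the SparkSession name spark is an assumption, and I have not verified it on 2.3.1 specifically:

# Sketch only: assumes a SparkSession is available as `spark`.
# trim(BOTH '[]' FROM ...) strips the brackets, split breaks on ', ',
# and explode turns each element into its own row (still as strings).
spark.sql(
    "SELECT explode(split(trim(BOTH '[]' FROM '[1, 2, 3]'), ', ')) AS elem"
).show()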
Is there any more concise way to go about this? I'm on Spark 2.3.1.
Upvotes: 0
Views: 507
Reputation: 11244
I am assuming here that the elements are single digits, but you get the idea:
>>> s = '[1,2,3]'
>>> [c for c in s if c.isdigit()]
['1', '2', '3']
>>> [int(c) for c in s if c.isdigit()]
[1, 2, 3]
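If the same parsing needed to run inside Spark rather than in plain Python, one rough sketch (the SparkSession, DataFrame, and column names here are assumptions, not from the original question) would be a UDF:

from pyspark.sql.functions import explode, udf
from pyspark.sql.types import ArrayType, IntegerType

# Sketch only: assumes a SparkSession is available as `spark`.
# Parse the bracketed string into a Python list of ints, then register
# that logic as a UDF so explode can be applied to the resulting array.
parse_array = udf(lambda s: [int(x) for x in s.strip('[]').split(',')],
                  ArrayType(IntegerType()))

df = spark.createDataFrame([('[1, 2, 3]',)], ['value'])
df.select(explode(parse_array('value')).alias('elem')).show()

A UDF is heavier than a pure split/trim expression in SQL, but it keeps the per-element parsing in ordinary Python.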
Upvotes: -1