Reputation: 23670
I have an AWS Athena (parquet) database that in the Athena console returns:
.. from the following query:
SELECT id,
TYPEOF(organisations) as type,
cardinality(organisations) as len,
organisations
FROM mydb.tbl;
Reading the full table remotely into Python (with awswrangler) is failing because organisations
is array (sometimes zero length). I've been trying to convert the organisaions
field to a string representation of the array (or equivalent json) but nothing I've tried so far works. Any sugestions much appreciated.
Upvotes: 0
Views: 2199
Reputation: 1305
If your array elements can be cast to varchar
, you can convert the array to string using the array_join
function, see the docs here
https://prestodb.io/docs/current/functions/array.html
For example:
WITH t(organizations) AS (
VALUES
ARRAY[1,2,3,4,5],
ARRAY[]
)
SELECT array_join(organizations, ',')
FROM t
If your arrays contains more complex elements, you can first use the transform
function to convert them to varchar "customly" and feed that to the array_join
function, for example:
WITH t(organisations) AS (
VALUES
ARRAY[cast(ROW('Police Scotland', 6) as ROW(organisation varchar, charoffset integer))],
ARRAY[]
)
SELECT '{' || array_join(transform(organisations, x -> '{organisation='||x.organisation||',charoffset='||cast(x.charoffset as varchar)||'}'),',') || '}'
FROM t
Upvotes: 0
Reputation: 5316
If you don't care about preserving the organization
and charoffset
field names, you can cast the array to JSON
and then format it as varchar
with json_format
:
WITH t(organizations) AS (
VALUES
ARRAY[ROW('x', 1), ROW('y', 2)],
ARRAY[ROW('a', 3), ROW('b', 4)],
ARRAY[]
)
SELECT json_format(CAST(organizations AS JSON))
FROM t
=>
_col0
-------------------
[["x",1],["y",2]]
[["a",3],["b",4]]
[]
Upvotes: 1