Reputation: 1260
I currently have a table in snowflake as below
Address Zip State
123 St. 94143 CA
3432 St. 93059 TX
I wan to create a json block as below:
{
"Address" : 123 St.,
"Zip" : 93059,
"State" : CA
},
{
"Address" : 3432 St.,
"Zip" : 94143,
"State" : TX
}
I have the code below:
select
OBJECT_CONSTRUCT (
'Address', Address ,
'Zip', Zip,
'State', State )
as json_value
from example_table_above;
The above currently returns sometimes single records of each key pair value, but not in a blob at all times.
ie it returns just this - no real patttern or seperation by record:
"JSON_VALUE"
{"Address":adsf}
{"Address":"Triang St"}
{"Zip":949}
{"State":CA}
Upvotes: 1
Views: 5935
Reputation: 601
Can you provide a bit more detail on what you need to do with this large JSON object(s)? Are you looking to output it from Snowflake as result set or to a file, or do you want to insert it back into another table in Snowflake?
With the unconstrained array_agg all qualifying rows from your query are being grouped into a single object, and if your table exceeds the 16777216 byte array/object limit, as in your case, you get the array size error your seeing.
Array_Agg and Object_Agg are aggregate functions, so if there is a grouping you can use to keep each Array/Object within the size limit you could use that?
e.g. grouping by STATE reduces each JSON object by ~1/50th in size.
Select
State,
ARRAY_AGG(
OBJECT_CONSTRUCT (
'Address', Address ,
'Zip', Zip,
'State', State )
)
as json_value
from T
Group By State;
How to unload single large JSON Object > 16777216 Bytes
Create some chunky test data
Create or Replace Table T as
Select
(MOD(ABS(RANDOM()),999)||' '||
UPPER(RANDSTR(
ABS(MOD(RANDOM(),20)),
RANDOM() ))||' '||
ARRAY_CONSTRUCT( 'Rd.', 'St.', 'Av.', 'Way', 'Dr.','Cl.')[MOD(ABS(RANDOM()),6)]) ADDRESS,
RIGHT(RANDOM(),5) ZIP,
ARRAY_CONSTRUCT( 'AL', 'AK', 'AS', 'AZ', 'AR', 'CA', 'CO', 'CT', 'DE', 'DC',
'FM', 'FL', 'GA', 'GU', 'HI', 'ID', 'IL', 'IN', 'IA', 'KS',
'KY', 'LA', 'ME', 'MH', 'MD', 'MA', 'MI', 'MN', 'MS', 'MO',
'MT', 'NE', 'NV', 'NH', 'NJ', 'NM', 'NY', 'NC', 'ND', 'MP',
'OH', 'OK', 'OR', 'PW', 'PA', 'PR', 'RI', 'SC', 'SD', 'TN',
'TX', 'UT', 'VT', 'VI', 'VA', 'WA', 'WV', 'WI', 'WY')[MOD(ABS(RANDOM()),59)]::VARCHAR STATE
from
table(GENERATOR( ROWCOUNT => 400000))
;
Create a view to construct JSON data as multi-line text
Create or Replace View V as
With
rc as (Select count(*) row_count from T),
ro as (Select
row_number() over ( order by NULL) as RID,
(object_construct(*))::VARCHAR row_obj
from T),
json_data as (
Select 0 as RID, '['::VARCHAR as JSON_LINE
UNION ALL
Select RID, row_obj as JSON_LINE from ro where RID = 1
UNION ALL
Select RID, ','||row_obj as JSON_LINE from ro where RID > 1
UNION ALL
Select row_count+1 as RID, ']'::VARCHAR as JSON_LINE from rc
)
Select RID, JSON_LINE from json_data
order by RID
;
Check the view output.
Select the 10 first and last records, including the array '[' ']' rows
Select RID, JSON_LINE from V where RID <=10 or RID >= 399990;
Create CSV stage to unload into
Note: file_format options to ensure escape characters aren't introduced
create or replace stage T_STAGE_CSV file_format = (
type = 'CSV'
skip_header = 1
ESCAPE = ' '
COMPRESSION = NONE);
Copy data from the View into the Stage.
copy into @TEST.TEST.T_STAGE_CSV from (Select JSON_LINE from V);
Check stage as output may get split across more than one file, so you will need to concatenate the files together externally of Snowflake!
list @TEST.TEST.T_STAGE_CSV;
Pull files to your client.
GET @TEST.TEST.T_STAGE_CSV/ file:////Users/fieldy/Documents/_sql_code/data;
Concatenate the files together externally e.g.
cat /Users/fieldy/Documents/_sql_code/data/data* > /Users/fieldy/Documents/_sql_code/datadata.json
Upvotes: 1
Reputation: 11046
This is a subtle point, but the JSON block you want is not valid. In order to make it valid, it needs to be wrapped with an outer array:
[
{
"Address": "123 St.",
"State": "CA",
"Zip": 94143
},
{
"Address": "3422 St",
"State": "TX",
"Zip": 93059
}
]
To get Snowflake to construct a JSON document like that, simply wrap the OBJECT_CONSTRUCT
function with ARRAY_AGG
like this:
select array_agg(object_construct('Address', address, 'Zip', zip, 'State', state)) from T1;
Upvotes: 1