0004

Reputation: 1260

Snowflake - Object Construct - Json Value

I currently have a table in Snowflake as below:

Address    Zip      State
123 St.    94143    CA
3432 St.   93059    TX

I want to create a JSON block as below:

{
  "Address" : "123 St.",
  "Zip" : 94143,
  "State" : "CA"
},
{
  "Address" : "3432 St.",
  "Zip" : 93059,
  "State" : "TX"
}

I have the code below:

select

OBJECT_CONSTRUCT (
'Address', Address  ,
'Zip', Zip,
'State', State  ) 

as json_value

from  example_table_above;

The above sometimes returns a single key/value pair per record, rather than a complete object for every row.

i.e. it returns just this, with no real pattern or separation by record:

"JSON_VALUE"
{"Address":adsf}
{"Address":"Triang St"}
{"Zip":949}
{"State":CA}

Upvotes: 1

Views: 5935

Answers (2)

Fieldy

Reputation: 601

Can you provide a bit more detail on what you need to do with this large JSON object (or objects)? Are you looking to output it from Snowflake as a result set or to a file, or do you want to insert it back into another table in Snowflake?

With an unconstrained ARRAY_AGG, all qualifying rows from your query are grouped into a single object, and if your table exceeds the 16,777,216-byte array/object limit, as in your case, you get the array-size error you're seeing.

ARRAY_AGG and OBJECT_AGG are aggregate functions, so if there is a grouping that keeps each array/object within the size limit, you can use that.

e.g. grouping by STATE reduces each JSON object to roughly 1/50th of the total size.

Select
    State,
    ARRAY_AGG(
        OBJECT_CONSTRUCT(
            'Address', Address,
            'Zip', Zip,
            'State', State)
    ) as json_value
from T
Group By State;
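As a rough client-side sanity check, you can estimate whether a grouping keeps each aggregated array under the 16,777,216-byte limit before running the aggregation. A minimal Python sketch, using hypothetical sample rows shaped like the question's table (the `group_fits` helper is illustrative, not part of Snowflake):

```python
import json

# Hypothetical sample rows, shaped like the question's table.
rows = [
    {"Address": "123 St.", "Zip": "94143", "State": "CA"},
    {"Address": "3432 St.", "Zip": "93059", "State": "TX"},
]

LIMIT = 16_777_216  # Snowflake's maximum ARRAY/OBJECT size in bytes


def group_fits(group_rows):
    """Return True if the serialized JSON array for one group
    stays within the 16 MB limit."""
    payload = json.dumps(group_rows)
    return len(payload.encode("utf-8")) <= LIMIT
```

If `group_fits` returns False for some grouping key, that key is too coarse and a finer grouping (or the file-unload approach below) is needed.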

How to unload single large JSON Object > 16777216 Bytes

Create some chunky test data

Create or Replace  Table T as
Select 
   (MOD(ABS(RANDOM()),999)||' '||
   UPPER(RANDSTR(
         ABS(MOD(RANDOM(),20)),
         RANDOM() ))||' '||
   ARRAY_CONSTRUCT( 'Rd.', 'St.', 'Av.', 'Way', 'Dr.','Cl.')[MOD(ABS(RANDOM()),6)]) ADDRESS, 
   RIGHT(RANDOM(),5) ZIP, 
   ARRAY_CONSTRUCT( 'AL', 'AK', 'AS', 'AZ', 'AR', 'CA', 'CO', 'CT', 'DE', 'DC',
                    'FM', 'FL', 'GA', 'GU', 'HI', 'ID', 'IL', 'IN', 'IA', 'KS',
                    'KY', 'LA', 'ME', 'MH', 'MD', 'MA', 'MI', 'MN', 'MS', 'MO', 
                    'MT', 'NE', 'NV', 'NH', 'NJ', 'NM', 'NY', 'NC', 'ND', 'MP',
                    'OH', 'OK', 'OR', 'PW', 'PA', 'PR', 'RI', 'SC', 'SD', 'TN',
                    'TX', 'UT', 'VT', 'VI', 'VA', 'WA', 'WV', 'WI', 'WY')[MOD(ABS(RANDOM()),59)]::VARCHAR  STATE
   from
table(GENERATOR( ROWCOUNT => 400000))
;

Create a view to construct JSON data as multi-line text

Create or Replace View V as 
With 
   rc as (Select count(*) row_count from T),
   ro as (Select 
              row_number() over ( order by NULL) as RID, 
              (object_construct(*))::VARCHAR row_obj 
            from T),
   json_data as (
     Select 0 as RID, '['::VARCHAR as JSON_LINE
       UNION ALL 
     Select RID, row_obj as JSON_LINE from ro where RID = 1
       UNION ALL
     Select  RID, ','||row_obj as JSON_LINE from ro where RID > 1
       UNION ALL
     Select row_count+1 as RID, ']'::VARCHAR as JSON_LINE from rc
)
Select RID, JSON_LINE from json_data
order by RID
;
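The view's line-assembly logic ('[' first, the first object bare, every later object prefixed with a comma, ']' last) can be sketched outside Snowflake as well. A minimal Python sketch; the sample rows and the `json_lines` helper are hypothetical stand-ins for table T:

```python
import json

# Hypothetical rows standing in for table T.
rows = [
    {"Address": "123 St.", "Zip": "94143", "State": "CA"},
    {"Address": "3432 St.", "Zip": "93059", "State": "TX"},
]


def json_lines(rows):
    """Yield the same line sequence the view produces:
    '[', the first object, ',' + each later object, then ']'."""
    yield "["
    for rid, row in enumerate(rows, start=1):
        obj = json.dumps(row)
        yield obj if rid == 1 else "," + obj
    yield "]"


# Joining the lines yields one valid JSON array, however many rows there are.
document = "\n".join(json_lines(rows))
parsed = json.loads(document)
```

This is why the view can be unloaded line by line and still concatenate into a single valid JSON document.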

Check the view output.
Select the first 10 and last 10 records, including the '[' and ']' array rows:

Select RID, JSON_LINE from V where RID  <=10 or RID >= 399990;

Create CSV stage to unload into

Note: the file_format options ensure no escape characters are introduced:

create or replace stage  T_STAGE_CSV file_format = (
                                      type = 'CSV' 
                                      skip_header = 1 
                                      ESCAPE = ' ' 
                                      COMPRESSION = NONE);

Copy data from the View into the Stage.

copy into @TEST.TEST.T_STAGE_CSV from (Select JSON_LINE from V);

Check the stage, as the output may get split across more than one file; if so, you will need to concatenate the files together outside of Snowflake!

list @TEST.TEST.T_STAGE_CSV;

Pull files to your client.

GET @TEST.TEST.T_STAGE_CSV file:///Users/fieldy/Documents/_sql_code/data;

Concatenate the files together externally e.g.

cat /Users/fieldy/Documents/_sql_code/data/data* > /Users/fieldy/Documents/_sql_code/data.json

Upvotes: 1

Greg Pavlik

Reputation: 11046

This is a subtle point, but the JSON block you want is not valid. In order to make it valid, it needs to be wrapped with an outer array:

[
  {
    "Address": "123 St.",
    "State": "CA",
    "Zip": 94143
  },
  {
    "Address": "3432 St.",
    "State": "TX",
    "Zip": 93059
  }
]
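The difference is easy to confirm with any JSON parser. A minimal Python sketch (the literal strings here are illustrative):

```python
import json

# Bare objects back to back, like the question's desired block, are NOT valid JSON...
bare = '{"Address": "123 St."}\n{"Address": "3432 St."}'
try:
    json.loads(bare)
    bare_is_valid = True
except json.JSONDecodeError:
    bare_is_valid = False

# ...while the same objects wrapped in an outer array parse cleanly.
wrapped = '[{"Address": "123 St."}, {"Address": "3432 St."}]'
parsed = json.loads(wrapped)
```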

To get Snowflake to construct a JSON document like that, simply wrap the OBJECT_CONSTRUCT function with ARRAY_AGG like this:

select array_agg(object_construct('Address', address, 'Zip', zip, 'State', state)) from T1;

Upvotes: 1
