Reputation: 51
I'm trying to run the snowflake copy into
command using a snowflake staged CSV file.
my command looks like this COPY INTO test_table FROM @my_stage FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1) PURGE = TRUE;
The CSV file data looks like this
id,name,age,email,country,score,subscribed,preferences.theme,preferences.notifications,hobbies.0,hobbies.1
1,AbcDefGhiJ,24,[email protected],USA,95.5,true,dark,true,reading,traveling
2,JklMnoPqrS,37,[email protected],Canada,88.3,false,light,false,sports,music
3,TuvWxyZabC,29,[email protected],UK,92.0,true,dark,true,cooking,hiking
4,DefGhiJklM,45,[email protected],Australia,85.7,false,light,true,gardening,photography
5,PqrStuVwxY,53,[email protected],Germany,90.2,true,dark,false,writing,fishing
Where the nested structure is represented with dot notation. preferences.theme
, preferences.notification
in my snowflake table preferences
is a variant column holding json data. the hobbies
column is an array.
I get this error for obvious reasons Number of columns in file (11) does not match that of the corresponding table (9)
I cannot find any information on how to achieve this. I've also tried formatting the CSV columns such that they are string representation of json like this
id,name,age,email,country,score,subscribed,preferences,hobbies
1,AbcDefGhiJ,24,[email protected],USA,95.5,true,"{""theme"": ""dark"", ""notifications"": true}","[""reading"", ""traveling""]"
2,JklMnoPqrS,37,[email protected],Canada,88.3,false,"{""theme"": ""light"", ""notifications"": false}","[""sports"", ""music""]"
3,TuvWxyZabC,29,[email protected],UK,92.0,true,"{""theme"": ""dark"", ""notifications"": true}","[""cooking"", ""hiking""]"
4,DefGhiJklM,45,[email protected],Australia,85.7,false,"{""theme"": ""light"", ""notifications"": true}","[""gardening"", ""photography""]"
5,PqrStuVwxY,53,[email protected],Germany,90.2,true,"{""theme"": ""dark"", ""notifications"": false}","[""writing"", ""fishing""]"
But get the same issue as it's handling it as though there are 11 columns not 9
The only thing that obviously works is actually treating the dot notations as their own columns. But there are a variable number of them that are not predictable. [this is all sample data]
I tried a few different CSV formats to get these VARIANT and ARRAY columns to behave properly with no luck.
Upvotes: 1
Views: 313
Reputation: 59275
To make sure you're formatting things in the right way, export first and then mimic that format.
I ran this test to make sure everything works as you want:
create table delete_test2 as
select 1 a, 2 b, 'hi' c, {'an':'object', 'more':4} obj;
copy into @fh_gcp_stage/delete_test2.csv
from delete_test2
file_format = (type=csv)
;
COPY INTO delete_test2 FROM @fh_gcp_stage/delete_test2.csv FILE_FORMAT = (TYPE = CSV ) ;
select *
from delete_test2;
That worked as expect, and the object was loaded into the variant. This is how the exported CSV looks:
1,2,hi,{"an":"object"\,"more":4}
As I loaded the exported row into the existing table, we expect to see the same row twice:
Furthermore, to test that the dot notation works with more complex objects:
insert into delete_test2
select 3, 4, 'h2', {'obj1':{'obj2':'layers'}};
copy into @fh_gcp_stage/delete_test2b.csv
from delete_test2
file_format = (type=csv);
COPY INTO delete_test2 FROM @fh_gcp_stage/delete_test2b.csv FILE_FORMAT = (TYPE = CSV ) ;
select *, obj:obj1.obj2
from delete_test2;
Upvotes: 0