Reputation: 23
Let's assume I have a .csv file like:
event,user
1,123
2,321
This .csv file is located in S3.
I run the following SQL to create an external table (with @TEST_STAGE already created and pointing to the correct S3 path):
CREATE OR REPLACE EXTERNAL TABLE TEST_CSV_TABLE1(
event_id VARCHAR AS (value:$1::varchar),
user_id VARCHAR AS (value:$2::varchar)
)
WITH LOCATION = @TEST_STAGE
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1);
Querying this table produces the following output:
|-----|----------------------------|----------|---------|
| Row | VALUE | EVENT_ID | USER_ID |
|-----|----------------------------|----------|---------|
| 1 | { "c1": "1", "c2": "123" } | NULL | NULL |
|-----|----------------------------|----------|---------|
| 2 | { "c1": "2", "c2": "321" } | NULL | NULL |
|-----|----------------------------|----------|---------|
However, if I just create a table as
CREATE OR REPLACE TABLE TEST_CSV_TABLE2(
event_id VARCHAR,
user_id VARCHAR
);
and load the same file like:
COPY INTO TEST_CSV_TABLE2 FROM @TEST_STAGE
FILES = ('test.csv')
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1);
or even like:
COPY INTO TEST_CSV_TABLE2
FROM (
SELECT
t.$1,
t.$2
FROM @TEST_STAGE t)
FILES = ('test.csv')
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1);
This results in properly assigned columns:
|-----|----------|---------|
| Row | EVENT_ID | USER_ID |
|-----|----------|---------|
| 1 | 1 | 123 |
|-----|----------|---------|
| 2 | 2 | 321 |
|-----|----------|---------|
Why are the columns not picked up properly in the case of the external table? Many thanks in advance.
Upvotes: 2
Views: 6650
Reputation: 2880
You need to use the name of the column when you are pulling it out of the JSON. What you have creates the JSON VALUE column and then parses it for attributes in the JSON called "$1" and "$2". When it doesn't find such an attribute, it returns NULL for that column.
CREATE OR REPLACE EXTERNAL TABLE TEST_CSV_TABLE1(
event_id VARCHAR AS (value:c1::varchar),
user_id VARCHAR AS (value:c2::varchar)
)
WITH LOCATION = @TEST_STAGE
FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1);
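With the c1/c2 references in place, querying the same table should now populate the virtual columns from the VALUE variant:
-- Check that the virtual columns now resolve against VALUE
SELECT value, event_id, user_id
FROM TEST_CSV_TABLE1;
-- Expect event_id = '1', user_id = '123' for the first row, and so on.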
Using a COPY INTO with $1 and $2 isn't using those to parse JSON like above; that is syntax specific to a COPY INTO query for referencing the columns of a file by position.
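If you want to see what $1 and $2 map to, you can also query the staged file directly. This is just a sketch and assumes a named file format (here called my_csv_format) created for the preview:
-- Hypothetical named file format matching the inline one used above
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = CSV FIELD_DELIMITER = ',' SKIP_HEADER = 1;

-- $1 and $2 are positional references into the staged file, not JSON paths
SELECT t.$1 AS event_id, t.$2 AS user_id
FROM @TEST_STAGE/test.csv (FILE_FORMAT => 'my_csv_format') t;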
Upvotes: 6