Reputation: 81
I have a data file in a BLOB and I am trying to create an external table from it. Here's how the file looks in a text editor:
"ABC"|"123"|"ID1"
"XYZ"|"456"|"ID2"
When I create an external table with a pipe as the delimiter and then query the individual columns, I see these values:
value:c1|value:c2|value:c3
"\"ABC\"" "\"123\"" "\"ID1\""
"\"XYZ\"" "\"345\"" "\"ID2\""
Is there something that I need to add to the field_optionally_enclosed_by
option? I am easily able to do
select replace(value:c1,'\\"'),replace(value:c2,'\\"'),replace(value:c3,'\\"') from testable
and this gives me back
value:c1|value:c2|value:c3
"ABC" "123" "ID1"
"XYZ" "345" "ID2"
Any suggestions?
Upvotes: 2
Views: 5060
Reputation: 1510
Please try below and see if it helps, it works in my test:
create or replace external table external_table
with location = @test.ericlin_s3_stage/t1/
auto_refresh = true
REFRESH_ON_CREATE = true
FILE_FORMAT = (
TYPE=CSV
COMPRESSION=NONE
REPLACE_INVALID_CHARACTERS = TRUE
FIELD_DELIMITER = '|'
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
ESCAPE_UNENCLOSED_FIELD = '\\'
);
select
value:c1, value:c2, value:c3
from external_table;
+----------+----------+----------+
| VALUE:C1 | VALUE:C2 | VALUE:C3 |
|----------+----------+----------|
| "ABC" | "123" | "ID1" |
| "XYZ" | "456" | "ID2" |
+----------+----------+----------+
select
value:c1::string as c1,
value:c2::string as c2,
value:c3::string as c3
from external_table;
+-----+-----+-----+
| C1 | C2 | C3 |
|-----+-----+-----|
| ABC | 123 | ID1 |
| XYZ | 456 | ID2 |
+-----+-----+-----+
Upvotes: 1
Reputation: 25903
If your data is always fixed width, and you are want independent values for writting to the external table, then SPLIT and a nested select could help here:
SELECT
get(s,0) as a,
get(s,1) as b,
get(s,2) as c
FROM (
SELECT split(column1,'|') as s
FROM VALUES
('"ABC"|"123"|"ID1"'),
('"XYZ"|"456"|"ID2"')
);
gives:
A | B | C |
---|---|---|
""ABC"" | ""123"" | ""ID1"" |
""XYZ"" | ""456"" | ""ID2"" |
this look funning because the output layer has escaped the inner double quotes. So you could use TRIM to drop those:
SELECT
trim(get(s,0),'"') as a,
trim(get(s,1),'"') as b,
trim(get(s,2),'"') as c
FROM (
SELECT split(column1,'|') as s
FROM VALUES
('"ABC"|"123"|"ID1"'),
('"XYZ"|"456"|"ID2"')
);
A | B | C |
---|---|---|
ABC | 123 | ID1 |
XYZ | 456 | ID2 |
and if you dislike the sub-select you can use SPLIT_PART
SELECT
trim(split_part(column1,'|',0),'"') as a,
trim(split_part(column1,'|',1),'"') as b,
trim(split_part(column1,'|',2),'"') as c
FROM VALUES
('"ABC"|"123"|"ID1"'),
('"XYZ"|"456"|"ID2"')
;
Upvotes: 0