danD
danD

Reputation: 716

Need to select bucket name while reading from s3 stage

I am reading from S3 folder in snowflake via stage. The bucket in s3 have multiple folder (or object if we want to call it). The folder is on date basis in the bucket

date=2020-06-01
date=2020-06-02
date=2020-06-03
date=2020-06-04
date=2020-06-05

I am using the below query to read all the folder at once. which is just working fine.

select raw.$1:name name,
       raw.$1:id ID
from
@My_Bucket/student_date/
(FILE_FORMAT => PARQUET, 
 PATTERN =>'.*date=.*\gz.parquet') raw
;   

Now i want to select the folder name as well in my query, Is there a way to do it.

like the output to contain

name | id | date..

pleas suggest

Upvotes: 0

Views: 300

Answers (1)

Marius
Marius

Reputation: 208

Snowflake has a built-in metadata field that provides the full filename, including the path. You should be able to run the following query:

select raw.$1:name name,
       raw.$1:id ID,
       METADATA$FILENAME 
from
@My_Bucket/student_date/
(FILE_FORMAT => PARQUET, 
 PATTERN =>'.*date=.*\gz.parquet') raw
; 

I know you are after the date portion only, but once you have the filename, you can use the SPLIT_PART function to get the date part from the filename. e.g.

SPLIT_PART(METADATA$FILENAME, '/', 4)

Hope this helps.

Upvotes: 4

Related Questions