Reputation: 726
I'm trying to extract date (20200222) from filename while copyinto command in snowflake.
File Name
s3://hadoop/2020002/dbo.DOSSIER_TRANSPORT_20200222.csv.gz
snowflake query
SELECT regexp_substr(metadata$filename, '/(_((\-|\+)?[0-9]+(\.[0-9]+)?).)/' as data FROM '@PRD.OPE_STG
_CMD.SPX_PRD_CMD/' (file_format => 'OTS_TEST.OPA_STG_BENE.OTD_FORMAT', pattern => '.*dbo.DOSSIER_TRANSPORT.*') ;
I tried this regex but its not supporting in snowflake. Getting below error
100048 (2201B): Invalid regular expression: '/(_((-|+)?[0-9]+(.[0-9]+)?).)/', no argument for repetition operator: +
Upvotes: 2
Views: 3759
Reputation: 2850
You need to use double escapes with Snowflake. If you just select your regexp string you will get:
SELECT '/(_((\-|\+)?[0-9]+(\.[0-9]+)?).)/';
=> /(_((-|+)?[0-9]+(.[0-9]+)?).)/
which is exactly what the regexp function will get as an input argument as well.
With double escapes you get:
SELECT '/(_((\\-|\\+)?[0-9]+(\\.[0-9]+)?).)/';
=> /(_((\-|\+)?[0-9]+(\.[0-9]+)?).)/
which is what I believe you want.
The error you get is from the regexp (-|+)
; the +
operator needs a real argument...
Upvotes: 0
Reputation: 626689
Use
REGEXP_SUBSTR(metadata$filename, '_([0-9]+)[.]', 1, 1, 'c', 1)
This is the regex demo
The pattern matches _
, then captures one or more digits in Group 1 and then matches .
. Since the group_num
argument is 1
, the return value is Group 1 value.
Upvotes: 3