KristiLuna
KristiLuna

Reputation: 1903

Snowflake LIKE not populating rows

When I run the below query:

SELECT load_dt, load_fname, exec_date FROM my_table
WHERE TO_DATE(exec_date) = '2021-10-18'

I see 3k rows that look like the below:

load_dt load_fname exec_date
2021-10-19 10:24:25.681 -0400 FK/pi/2021-10-18/post_date=2021-10-12/2021-10-12_1247120.csv 2021-10-18 09:00:00.000
2021-10-19 10:24:18.543 -0400 FK/pi/2021-10-18/post_date=2021-10-12/2021-10-12_3154712.csv 2021-10-18 09:00:00.000

I want to delete the 3k rows so I can reload the correct data based on the load_fname, which is the file location in s3

please note I'm getting the s3 file location using airflow:

s3_full_path = self.get_partitioned_location(date).replace(
            f's3://{AWS_S3_BUCKET}/', '')

which is what I use in my query below after LIKE, it changes depending on date: FK/pi/2021-10-18/post_date=2021-10-12/

so I tried this but it came up empty:

SELECT  load_dt, load_fname FROM my_table
WHERE load_fname like 'FK/pi/2021-10-18/post_date=2021-10-12/'

Upvotes: 4

Views: 121

Answers (1)

Robert Long
Robert Long

Reputation: 6812

From the description, you want to match rows where the column load_fname begins with the following:

FK/pi/2021-10-18/post_date=2021-10-12/

So, that means you want to match all of

FK/pi/2021-10-18/post_date=2021-10-12/
FK/pi/2021-10-18/post_date=2021-10-12/2
FK/pi/2021-10-18/post_date=2021-10-12/2021
FK/pi/2021-10-18/post_date=2021-10-12/2021-10-12
FK/pi/2021-10-18/post_date=2021-10-12/2021-10-12_987654321.xyz
FK/pi/2021-10-18/post_date=2021-10-12/2021-10-12_ABCDEFGHIJKLMNOPQ.abc
FK/pi/2021-10-18/post_date=2021-10-12/2021-10-12_1247120.csv
FK/pi/2021-10-18/post_date=2021-10-12/2021-10-12_3154712.csv

where the last 2 are the ones in your data extract. To acheive this, we simply add the wildcard % to the end of the string:

WHERE load_fname like 'FK/pi/2021-10-18/post_date=2021-10-12/%'

The wildcard literally means - match rows where % could be nothing, or any string.

As for the question in the comments:

I noticed adding "%" to the beginning of the string (like Dean Flinter mentioned) also works and seems to give me the same rows as "%" only at the end. Why is this?

based on the meaning of the wildcard (match rows where % could be nothing, or any string) so if the string to match was:

%FK/pi/2021-10-18/post_date=2021-10-12/%

then this would, in addition to mathing all those I wrote above, ALSO match any string that starts with anything whatsoever of any length (or nothing at all) and is then followed by FK/pi/2021-10-18/post_date=2021-10-12/ plus anything matched by the trailing %. So it would match, for example:

FK/pi/2021-10-18/post_date=2021-10-12/
FK/pi/2021-10-18/post_date=2021-10-12/abcde
abcde/FK/pi/2021-10-18/post_date=2021-10-12/
abcde/FK/pi/2021-10-18/post_date=2021-10-12/abcde

There are other wildcards you can use. For example. If you knew that there are exactly 4 characters AFTER the FK/pi/2021-10-18/post_date=2021-10-12/ then you could use

WHERE load_fname LIKE 'FK/pi/2021-10-18/post_date=2021-10-12/____ 

where we have used 4 underscore wild cards, each of which will match 1 and only 1 characters. So this would match:

FK/pi/2021-10-18/post_date=2021-10-12/2021
FK/pi/2021-10-18/post_date=2021-10-12/2022
FK/pi/2021-10-18/post_date=2021-10-12/2023
FK/pi/2021-10-18/post_date=2021-10-12/ABCD
FK/pi/2021-10-18/post_date=2021-10-12/abcd

but not

FK/pi/2021-10-18/post_date=2021-10-12/2021-10-12
FK/pi/2021-10-18/post_date=2021-10-12/A
FK/pi/2021-10-18/post_date=2021-10-12/

Upvotes: 1

Related Questions