SharpLu

Reputation: 1214

How to define a file name filter pattern in Python

I have a question about how to define a pattern for matching file names in Python.

I have a bunch of files, listed below. I need to filter them by the date in the file name, and I don't want to split on '_'.

My attempt is below:

src_format = "TrackLog_%Y%m%d_*.csv"
dst_exist_dates = [datetime.strptime(f, 'TrackLog_%Y%m%d_*.csv') for f in s3.list_files_as_list(s3_profile, src_path, recursive=True)]


2017-10-05 04:23:39  969083134 TrackLog_20171004_070602.csv
2017-10-06 04:23:52  986127990 TrackLog_20171005_070555.csv
2017-10-07 04:26:09  991914033 TrackLog_20171006_070929.csv
2017-10-08 04:24:51  996154180 TrackLog_20171007_070656.csv
2017-10-09 04:23:02  998725794 TrackLog_20171008_070647.csv
2017-10-10 04:24:49 1002421079 TrackLog_20171009_070550.csv
2017-10-11 04:25:51 1008595262 TrackLog_20171010_070553.csv
2017-10-12 04:24:04 1015542121 TrackLog_20171011_070555.csv
2017-10-13 04:24:06 1041053623 TrackLog_20171012_070620.csv
2017-10-14 04:26:59 1049256243 TrackLog_20171013_070929.csv

But I get the exception below. It looks like my pattern is not right. Can anyone help me? Thank you very much.

ValueError: time data 'TrackLog_20160315_123456.csv' does not match format 'TrackLog_%Y%m%d_*.csv'

Upvotes: 0

Views: 78

Answers (2)

Rakesh

Reputation: 82765

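You can use string slicing, since the date always sits at the same position in the file name.

Ex:

import datetime
s = 'TrackLog_20160315_123456.csv'
# s[9:17] is the 8-character YYYYMMDD part that follows 'TrackLog_'
print(datetime.datetime.strptime(s[9:17], '%Y%m%d'))

Output:

2016-03-15 00:00:00

In your case (with from datetime import datetime, as in your code):

dst_exist_dates = [datetime.strptime(f[9:17], '%Y%m%d') for f in s3.list_files_as_list(s3_profile, src_path, recursive=True)]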
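If the listing returns full keys rather than bare file names (which seems possible with recursive=True, though I have not checked what s3.list_files_as_list returns), the fixed offsets 9:17 would no longer line up. A minimal sketch that strips the directory part first and skips names that do not fit; the helper name date_from_name and the PREFIX constant are made up for illustration:

```python
import posixpath
from datetime import datetime

# Prefix taken from the file names in the question; the constant name is hypothetical.
PREFIX = "TrackLog_"


def date_from_name(path):
    """Return the date encoded in a TrackLog file name, or None if it does not fit."""
    # Assumes '/'-separated keys; drop any directory part before slicing.
    name = posixpath.basename(path)
    if not name.startswith(PREFIX):
        return None
    start = len(PREFIX)
    try:
        # The 8 characters after the prefix should be YYYYMMDD.
        return datetime.strptime(name[start:start + 8], "%Y%m%d")
    except ValueError:
        return None
```

You could then build the list with something like [d for d in map(date_from_name, files) if d is not None], where files is whatever your listing call returns.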

Upvotes: 1

blhsing

Reputation: 106455

strptime does not support glob patterns, so you can't use * as a wildcard. Instead, use re.match to obtain the date part of the file name, and then let strptime parse the date.

Change the parsing line to the following (after importing re):

dst_exist_dates = [datetime.strptime(re.match(r'[^_]+_([^_]+)_', f).group(1), '%Y%m%d') for f in s3.list_files_as_list(s3_profile, src_path, recursive=True)]
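One caveat with the one-liner: re.match returns None for a name that does not fit, so .group(1) would raise AttributeError on any stray file. A rough sketch of a stricter variant that skips such names instead; the names NAME_RE and parse_dates are only illustrative, and the 6-digit time part is an assumption based on the listing in the question:

```python
import re
from datetime import datetime

# Matches e.g. 'TrackLog_20171004_070602.csv', optionally preceded by a directory part.
# The 8-digit date and 6-digit time are assumptions taken from the question's listing.
NAME_RE = re.compile(r"(?:^|/)TrackLog_(\d{8})_\d{6}\.csv$")


def parse_dates(names):
    """Collect the dates from matching file names, ignoring everything else."""
    dates = []
    for name in names:
        m = NAME_RE.search(name)
        if m is None:
            continue  # not a TrackLog file
        try:
            dates.append(datetime.strptime(m.group(1), "%Y%m%d"))
        except ValueError:
            continue  # eight digits, but not a real calendar date
    return dates
```

With this, dst_exist_dates = parse_dates(...) would take the same listing call as above, and files with unexpected names are dropped rather than raising.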

Upvotes: 0
