JTee

Reputation: 9

Why is my condition in Python not being met?

I have a list of filenames sorted by creation date. These files contain a datetime in the filename for their creation date time. I am attempting to create a sub list for all files after a certain time.

Full list of files -

Allfilenames = ['CCN-200 data 130321055347.csv',
'CCN-200 data 130321060000.csv',
'CCN-200 data 130321063235.csv',
'CCN-200 data 130321070000.csv',
'CCN-200 data 130321080000.csv',
'CCN-200 data 130321090000.csv',
'CCN-200 data 130321100000.csv',
'CCN-200 data 130321110000.csv',
'CCN-200 data 130321120000.csv',
'CCN-200 data 130321130000.csv',
'CCN-200 data 130321140000.csv',
'CCN-200 data 130321150000.csv']

Positions [19:24] give the time in the format hhmmss. I am using

TOffRound = "080000"

filenames = [s for s in Allfilenames if os.path.basename(s)[19:24] >= TOffRound]

The result should be a list of all filenames created at or after 08:00:00; however, the resulting list is missing the "080000" file.

filenames = ['CCN-200 data 130321090000.csv',
'CCN-200 data 130321100000.csv',
'CCN-200 data 130321110000.csv',
'CCN-200 data 130321120000.csv',
'CCN-200 data 130321130000.csv',
'CCN-200 data 130321140000.csv',
'CCN-200 data 130321150000.csv']

Why is the conditional not returning true for the = part of the condition and including 'CCN-200 data 130321080000.csv' in my list? Please note that I have shown only the basenames here for clarity.

Upvotes: 0

Views: 75

Answers (3)

thiruvenkadam

Reputation: 4260

The problem with the code, as others have suggested, is that your slice drops the last digit of the time. When you slice a sequence, the "stop" index given after the : is excluded from the result.

For example:
>>> a = "hello world"
>>> print(a[0:4])
hell
>>> print(a[0:5])
hello

So, change this line in your code and you are good to go:

filenames = [s for s in Allfilenames if os.path.basename(s)[19:25] >= TOffRound]
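To see why only the 080000 file drops out, here is a small check on one of the names from the question. The variable names (name, cutoff, short, full) are just for illustration. With the stop index at 24 the slice is five characters long, and Python orders strings character by character, so a string that is a strict prefix of another compares as smaller.

```python
name = "CCN-200 data 130321080000.csv"
cutoff = "080000"

short = name[19:24]  # stop index 24 is excluded, so only five characters come back
full = name[19:25]   # six characters: the whole hhmmss field

print(repr(short), short >= cutoff)  # '08000' False
print(repr(full), full >= cutoff)    # '080000' True

# Later files still passed with the five-character slice because the
# comparison is already decided at the second character ('9' > '8').
print("09000" >= cutoff)  # True
```

That is why every file from 09:00:00 onwards still appeared in the result, while the one that should have been exactly equal to the cutoff did not.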

However, this approach does not scale. Hard-coded indices are hard to maintain and break on any file whose name is even slightly different. The code can be rewritten like this:

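import os

def filter_files(file_list, TOffRound):
    text_length = len(TOffRound)
    # Drop the extension first; otherwise the slice would return "00.csv" instead of the hhmmss digits
    return [file_name for file_name in file_list if os.path.splitext(file_name)[0][-text_length:] >= TOffRound]

This works regardless of the length of the file name, as long as the time is the last thing before the extension.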

I would also suggest selecting the files by their modification time, which you can get from os.stat or os.path.getmtime, rather than by file name. A file name is a string, and although it can tell you which files are older or newer, relying on it is generally not a good idea. You are converting a timestamp to a string to build the file name, and that string then has to be turned back into a time before it can be compared properly. If you use the file modification time instead, you work with date and time values throughout and skip those conversions. This has a few advantages:

  • The file name, or any other explicit parameter, can change over time without forcing you to change the logic again and again.
  • File-based timestamps exist for exactly this kind of purpose, so they give you more control. For instance, selecting only the files created or modified within a specific time period is easy with file timestamps.
  • It separates the time logic from the file names, so you can name files more meaningfully for their purpose, which simplifies maintaining the code over time.
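A minimal sketch of the modification-time approach, assuming the files are on the local disk and that their modification times still reflect when the data was written (copying or editing a file changes this, and modification time is not the same thing as creation time). The function name files_modified_at_or_after is made up for this example.

```python
import datetime
import os

def files_modified_at_or_after(paths, cutoff):
    """Return the paths whose modification time of day is at or after cutoff.

    cutoff is a datetime.time; modification times are read in local time.
    """
    selected = []
    for path in paths:
        # os.path.getmtime returns seconds since the epoch as a float
        modified = datetime.datetime.fromtimestamp(os.path.getmtime(path))
        if modified.time() >= cutoff:
            selected.append(path)
    return selected
```

You would call it as files_modified_at_or_after(Allfilenames, datetime.time(8, 0)), with full paths in the list so that os.path.getmtime can find the files.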

Upvotes: 0

MervS

Reputation: 5902

Instead of checking the time part as a string, I would suggest a more robust way to test the time part of your filename: extract the date-time digits from the filename, parse them, and compare the resulting time against your specified time as a time object.

import re
import datetime

TOffRound = datetime.time(8, 0)
filenames = []

for s in Allfilenames:
  datestr = re.search(r"\d{12}", s).group(0)
  dateobj = datetime.datetime.strptime(datestr,"%y%m%d%H%M%S")
  timeobj = dateobj.time()
  if timeobj >= TOffRound:
    filenames.append(s)
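One thing to watch with this approach: re.search returns None for a name that has no 12-digit run, and calling .group(0) on None raises AttributeError. Below is a sketch of a variant that skips such names. It is only an illustration under a few assumptions: the names end in twelve digits followed by .csv, the last six of those digits are hhmmss, and the function name files_at_or_after and the pattern name STAMP are made up here. It parses only the time half, so it does not need to know how the date half is ordered.

```python
import datetime
import re

# Six date digits, then six hhmmss digits, then the extension (assumed layout)
STAMP = re.compile(r"(\d{6})(\d{6})\.csv$")

def files_at_or_after(names, cutoff):
    """Keep the names whose embedded hhmmss is at or after cutoff (a datetime.time)."""
    kept = []
    for name in names:
        match = STAMP.search(name)
        if match is None:
            continue  # no timestamp in this name, so skip it
        time_of_day = datetime.datetime.strptime(match.group(2), "%H%M%S").time()
        if time_of_day >= cutoff:
            kept.append(name)
    return kept

names = [
    "CCN-200 data 130321070000.csv",
    "CCN-200 data 130321080000.csv",
    "CCN-200 data 130321090000.csv",
    "notes.txt",
]
print(files_at_or_after(names, datetime.time(8, 0)))
```

With the sample list above, the 08:00:00 and 09:00:00 files are kept, and notes.txt is skipped instead of raising an error.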

Upvotes: 1

Tulashi Gautam

Reputation: 64

In your filenames, hhmmss occupies the slice 19:25 rather than 19:24. So the correct statement to get hhmmss from the filename is:

filenames = [s for s in Allfilenames if os.path.basename(s)[19:25] >= TOffRound]

Upvotes: 0
