pythoner991

Reputation: 33

Python: filter files by modified time

I have the following code:

def filter_by_time(files):
    print "---List of log files to timecheck: "
    for f in files:
        print f, datetime.datetime.fromtimestamp(os.path.getmtime(f))
    print "------"

    mins = datetime.timedelta(minutes=int(raw_input("Age of log files in minutes? ")))
    print "Taking ", mins, "minutes"
    mins = mins.total_seconds()
    current = time.time()
    difftime = current - mins
    print "current time: ", datetime.datetime.fromtimestamp(current)
    print "logs from after: ", datetime.datetime.fromtimestamp(difftime)   

    for f in files:    
        tLog = os.path.getmtime(f)
        print "checking ", f, datetime.datetime.fromtimestamp(tLog)
        if difftime > tLog:
            print "difftime is bigger than tLog", "removing ", f
            files.remove(f)

    print "*****List of log files after timecheck"
    for f in files:
        print f, datetime.datetime.fromtimestamp(os.path.getmtime(f)) 
    print "******"  
    return files

I run it on a handful of sample log files. When I enter an age of a few minutes, the output is:

List of log files to timecheck: 

1copy2.log  11:59:40

1copy3.log  12:13:53

1copy.log  11:59:40

1.log  11:59:40

Age of log files in minutes? 5

Taking  0:05:00 minutes

current time:  2015-07-14 14:02:11.861755

logs from after:  2015-07-14 13:57:11.861755

checking  1copy2.log 2015-07-14 11:59:40

difftime is bigger than tLog removing  1copy2.log

checking  1copy.log 2015-07-14 11:59:40

difftime is bigger than tLog removing  1copy.log

List of log files after timecheck



1copy3.log 2015-07-14 12:13:53

1.log 2015-07-14 11:59:40



Collected: 1copy3.log

Collected: 1.log

As you can see, the files it has collected are not correct. The code is supposed to check the 4 files and keep only those modified in the last 5 minutes. It removes 2 from the list, but it should have removed all 4.

(I made some edits to make this easier to read.)

Upvotes: 3

Views: 7742

Answers (2)

Nikhil Shinday

Reputation: 1256

Consider the built-in filter function. I assume that your input is a list of files and a number of minutes, and that you want to keep only the files that were last modified between that many minutes ago and the present time.

import datetime, os

def files_after(files, minutes):
    # Oldest modification time that still counts as recent.
    lower_time_bound = datetime.datetime.now() - datetime.timedelta(minutes=minutes)
    # Keep files modified after the bound (in Python 3, wrap the result in list()).
    return filter(lambda f: datetime.datetime.fromtimestamp(os.path.getmtime(f)) > lower_time_bound, files)
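
For readers on Python 3, the same filtering idea can be sketched with a list comprehension and plain epoch seconds. This is only a sketch: the name `files_modified_within` and the optional `now` parameter are hypothetical, added here so the cutoff can be pinned down when testing.

```python
import os
import time


def files_modified_within(files, minutes, now=None):
    """Return the paths in `files` modified within the last `minutes` minutes.

    `now` is a hypothetical hook (epoch seconds) so the cutoff can be fixed
    in tests; by default the current time is used.
    """
    if now is None:
        now = time.time()
    cutoff = now - minutes * 60  # epoch seconds of the oldest acceptable mtime
    # A new list is built, so `files` is never mutated while being iterated.
    return [f for f in files if os.path.getmtime(f) >= cutoff]
```

Because the function returns a new list and leaves its argument alone, the caller would write something like `files = files_modified_within(files, 5)`.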

Upvotes: 5

Robᵩ

Reputation: 168616

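You are removing items from the list while you are iterating over it. Each removal shifts the remaining items one position to the left, so the loop skips the item that follows every removed one. That is why 1copy3.log and 1.log were never checked in your output.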
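
The effect of removing during iteration can be reproduced without touching the file system. The sketch below (Python 3) uses the four file names from the question and a hypothetical helper, `remove_while_iterating`, that removes every item it visits and records which ones the loop actually reached.

```python
def remove_while_iterating(items, should_remove):
    """Mimic the loop in the question: remove from the list being iterated.

    Returns the items the loop actually visited.
    """
    checked = []
    for item in items:
        checked.append(item)
        if should_remove(item):
            items.remove(item)  # shifts later items left by one position
    return checked


names = ["1copy2.log", "1copy3.log", "1copy.log", "1.log"]
# Every file is "too old", so ideally all four would be removed.
checked = remove_while_iterating(names, lambda name: True)
print(checked)  # ['1copy2.log', '1copy.log']
print(names)    # ['1copy3.log', '1.log']
```

Only two of the four names are visited, and the two that survive are the same ones that were wrongly collected in the question.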

Try iterating over a copy of the list like so:

for f in files[:]:    # Note the [:] after "files"
    tLog = os.path.getmtime(f)
    print "checking ", f, datetime.datetime.fromtimestamp(tLog)
    if difftime > tLog:
        print "difftime is bigger than tLog", "removing ", f
        files.remove(f)
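
If the caller relies on the original list object being modified, another option is to build the filtered list first and then assign it back with a slice. This is a minimal Python 3 sketch: `keep_recent_in_place` and its `get_mtime` parameter are hypothetical names, and the fake modification times stand in for real files.

```python
import os


def keep_recent_in_place(files, cutoff, get_mtime=os.path.getmtime):
    """Drop entries older than `cutoff` (epoch seconds) from `files` in place.

    `get_mtime` is a hypothetical hook so the lookup can be replaced in tests.
    """
    # The comprehension iterates over the old contents; the slice assignment
    # then replaces them inside the same list object.
    files[:] = [f for f in files if get_mtime(f) >= cutoff]
    return files


# Made-up modification times used instead of real files.
fake_mtimes = {"a.log": 100.0, "b.log": 250.0, "c.log": 90.0, "d.log": 300.0}
logs = ["a.log", "b.log", "c.log", "d.log"]
same_list = logs
keep_recent_in_place(logs, 200.0, get_mtime=fake_mtimes.get)
print(logs)  # ['b.log', 'd.log']
```

Here `same_list` and `logs` still refer to the same object afterwards, which matches the behaviour of the `files.remove(f)` loop for code that holds another reference to the list.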

Upvotes: 3
