Stefano Borini

Reputation: 143935

Python library to detect if a file has changed between different runs?

Suppose I have a program A. I run it, and it performs some operation starting from a file foo.txt. Then A terminates.

New run of A. It checks if the file foo.txt has changed. If the file has changed, A runs its operation again, otherwise, it quits.

Does a library function or external library for this exist?

Of course it can be implemented with an md5 plus a file/db containing the md5. I want to avoid reinventing the wheel.

Upvotes: 10

Views: 13125

Answers (4)

Sufian

Reputation: 8755

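It's unlikely that someone made a library for something so simple. Solution in a few lines:

import pickle
import hashlib
try:
    # pickle files must be opened in binary mode
    with open("db", "rb") as f:
        l = pickle.load(f)
except IOError:
    l = []
db = dict(l)
path = "/etc/hosts"
# store the hex digest (a str): hash objects can be neither pickled nor compared
with open(path, "rb") as f:
    checksum = hashlib.md5(f.read()).hexdigest()
if db.get(path, None) != checksum:
    print("file changed")
    db[path] = checksum
with open("db", "wb") as f:
    pickle.dump(list(db.items()), f)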
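
If several files need checking, the same idea can be wrapped in a function. What follows is only a sketch and not part of the answer above: the names `file_digest` and `has_changed` and the `checksums.json` store are made up for illustration, JSON is swapped in for pickle, and the file is hashed in chunks so large files do not have to fit in memory.

```python
import hashlib
import json
import os


def file_digest(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, db_path="checksums.json"):
    """Return True if `path` differs from the digest recorded in `db_path`.

    The recorded digest is updated as a side effect, so the next call
    returns False unless the file changes again.
    """
    db = {}
    if os.path.exists(db_path):
        with open(db_path) as f:
            db = json.load(f)
    key = os.path.abspath(path)  # hypothetical choice: key entries by absolute path
    checksum = file_digest(path)
    changed = db.get(key) != checksum
    if changed:
        db[key] = checksum
        with open(db_path, "w") as f:
            json.dump(db, f)
    return changed
```

Program A would then start with something like `if not has_changed("foo.txt"): sys.exit()`. A file that has never been seen counts as changed, which matches the first-run behaviour described in the question.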

Upvotes: 10

James Nelson

Reputation: 781

FYI, for those using this example who got the error "TypeError: can't pickle HASH objects": the hash object itself cannot be pickled, so store its hex digest instead. Simply modify the code as follows (and use hashlib rather than the md5 module, which is deprecated):

    import pickle
    import hashlib  # instead of md5
    try:
        with open("db", "rb") as f:
            l = pickle.load(f)
    except IOError:
        l = []
    db = dict(l)
    path = "/etc/hosts"
    # hexdigest() converts the hash to text, which can be pickled
    with open(path, "rb") as f:
        checksum = hashlib.md5(f.read()).hexdigest()
    if db.get(path, None) != checksum:
        print("file changed")
        db[path] = checksum
    with open("db", "wb") as f:
        pickle.dump(list(db.items()), f)

So just change:

    checksum = hashlib.md5(f.read())

to

    checksum = hashlib.md5(f.read()).hexdigest()

Upvotes: 7

Nicholas Knight

Reputation: 16045

This is one of those things that is both so trivial to implement and so app-specific that there really wouldn't be any point in a library. Any library intended for this purpose would grow so unwieldy trying to adapt to the many variations required that learning and using it would take as much time as implementing it yourself.

Upvotes: 2

NM.

Reputation: 1929

Can't we just check the last modified date? That is, after the first operation we store the last modified date in the db, and before running again we compare the last modified date of foo.txt with the value stored in our db. If they differ, we perform the operation again.
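
A rough sketch of that idea, under some assumptions: the function name `modified_since_last_run` and the `mtimes.json` store are invented for illustration, and the "db" is just a JSON file mapping paths to modification times.

```python
import json
import os


def modified_since_last_run(path, db_path="mtimes.json"):
    """Return True if the modification time of `path` differs from the stored one.

    The stored time is updated as a side effect.
    """
    db = {}
    if os.path.exists(db_path):
        with open(db_path) as f:
            db = json.load(f)
    key = os.path.abspath(path)
    # st_mtime_ns is an integer (nanoseconds), so it survives the JSON
    # round trip exactly, unlike the float st_mtime
    mtime = os.stat(path).st_mtime_ns
    changed = db.get(key) != mtime
    if changed:
        db[key] = mtime
        with open(db_path, "w") as f:
            json.dump(db, f)
    return changed
```

This is cheaper than hashing because the file is never read, but it answers a slightly different question: touching the file without editing it updates the timestamp and so counts as a change, which a checksum comparison would not report.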

Upvotes: 0
