Reputation: 1173
I've created a modified watchdog example in order to monitor a directory in Windows for .jpg photos that have been added to it.
import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

paths = []
xp_mode = 'off'

class FileHandler(FileSystemEventHandler):
    def on_created(self, event):
        if xp_mode == 'on':
            if not event.is_directory and 'thumbnail' not in event.src_path:
                print("Created: " + event.src_path)
                paths.append(event.src_path)

    def on_modified(self, event):
        if not event.is_directory and 'thumbnail' not in event.src_path:
            print("Modified: " + event.src_path)
            paths.append(event.src_path)

if __name__ == "__main__":
    path = 'C:\\'
    event_handler = FileHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
One of the things that I have noticed is that when a file is added, both on_created and on_modified are called! To combat this problem, I decided to only use the on_modified method. However, I am starting to notice that this also causes multiple callbacks, but this time to the on_modified method!
Modified: C:\images\C121211-0008.jpg
Modified: C:\images\C121211-0009.jpg
Modified: C:\images\C121211-0009.jpg <--- What?
Modified: C:\images\C121211-0010.jpg
Modified: C:\images\C121211-0011.jpg
Modified: C:\images\C121211-0012.jpg
Modified: C:\images\C121211-0013.jpg
I cannot figure out for the life of me why this is happening! It doesn't seem to be consistent either. If anyone could shed some light on this issue, it would be greatly appreciated.
There was a similar post, but it was for Linux: python watchdog modified and created duplicate events
Upvotes: 2
Views: 5478
Reputation: 434
A pretty robust workaround, compared with fragile debouncing, that works for my use case on Windows is to check the modification time of the file. Something like this works for me:
import os
from watchdog.events import FileSystemEventHandler

class MyHandler(FileSystemEventHandler):
    def __init__(self):
        # maps each path to the last modification time seen for it
        self.times = {}

    def on_modified(self, event):
        try:
            t = os.path.getmtime(event.src_path)
            if event.src_path in self.times and t == self.times[event.src_path]:
                # duplicate event
                return
            self.times[event.src_path] = t
        except FileNotFoundError:
            # file got deleted after event was triggered
            try:
                del self.times[event.src_path]
            except KeyError:
                pass
        # continue processing event
You can call on_modified() from on_created() if you want a single callback for modify and create, and want the code to work on OSes (are there any?) where creating a file doesn't also trigger on_modified().
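To make the single-callback idea concrete, here is a rough sketch of a handler whose creation callback simply delegates to the modification callback, with the same modification-time check as above. So that it runs without watchdog installed, it uses a plain base class and a stand-in event type; `FakeEvent`, `SingleCallbackHandler`, and the `handled` list are hypothetical names for illustration, and in real use the class would subclass watchdog's `FileSystemEventHandler`. Unlike the snippet above, this sketch assumes that a file which has already vanished should be skipped rather than processed.

```python
import os
from collections import namedtuple

# Stand-in for a watchdog event, so the sketch runs without watchdog installed.
FakeEvent = namedtuple("FakeEvent", ["src_path", "is_directory"])


class SingleCallbackHandler(object):
    """Illustrative only; in real use, subclass watchdog's FileSystemEventHandler."""

    def __init__(self):
        self.times = {}    # path -> last modification time that was handled
        self.handled = []  # paths passed on for processing, in order

    def on_created(self, event):
        # Route creations through the same code path as modifications.
        self.on_modified(event)

    def on_modified(self, event):
        if event.is_directory:
            return
        try:
            t = os.path.getmtime(event.src_path)
        except OSError:
            # File vanished after the event fired (assumption: skip it).
            self.times.pop(event.src_path, None)
            return
        if self.times.get(event.src_path) == t:
            return  # same mtime as the last handled event: treat as duplicate
        self.times[event.src_path] = t
        self.handled.append(event.src_path)
```

Catching `OSError` rather than `FileNotFoundError` keeps the sketch usable on Python 2 as well, since `FileNotFoundError` is a Python 3 subclass of `OSError`.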
Upvotes: 0
Reputation: 281835
When a process writes a file, it first creates it, then writes the contents a piece at a time.
What you're seeing is a set of events corresponding to those actions. Sometimes the pieces are written quickly enough that Windows only sends a single event for all of them, and other times you get multiple events.
This is normal... depending on what the surrounding code needs to do, it might make sense to keep a set of modified pathnames rather than a list.
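As a minimal sketch of the set-based approach, the helper below collects paths so that repeated events for the same file collapse into one entry. `ModifiedPaths`, `record`, and `drain` are hypothetical names, not part of watchdog; the idea is that on_modified would call `record(event.src_path)` and the main loop would call `drain()` periodically. The lock is there on the assumption that the handler runs on the observer's background thread while the main thread reads the results.

```python
import threading


class ModifiedPaths(object):
    """Collects modified paths; duplicate events for one path count once."""

    def __init__(self):
        self._paths = set()
        self._lock = threading.Lock()

    def record(self, path):
        # Adding a path that is already present leaves the set unchanged.
        with self._lock:
            self._paths.add(path)

    def drain(self):
        """Return the paths recorded so far (sorted) and reset the collection."""
        with self._lock:
            batch = sorted(self._paths)
            self._paths = set()
        return batch


if __name__ == "__main__":
    seen = ModifiedPaths()
    for p in [r"C:\images\C121211-0008.jpg",
              r"C:\images\C121211-0009.jpg",
              r"C:\images\C121211-0009.jpg"]:
        seen.record(p)
    print(seen.drain())  # the duplicated 0009 path appears once
```

A set does not preserve arrival order, which is why `drain()` sorts; if order of first arrival matters, a dict keyed by path would be one alternative.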
Upvotes: 6