user1927638

Reputation: 1173

Python watchdog duplicate events

I've created a modified watchdog example to monitor a specific directory in Windows for newly added .jpg photos.

import time
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

paths = []

xp_mode = 'off'

class FileHandler(FileSystemEventHandler):

    def on_created(self, event):
        if xp_mode == 'on':
            if not event.is_directory and not 'thumbnail' in event.src_path:
                print "Created: " + event.src_path
                paths.append(event.src_path)

    def on_modified(self, event):
        if not event.is_directory and not 'thumbnail' in event.src_path:
            print "Modified: " + event.src_path
            paths.append(event.src_path)

if __name__ == "__main__":
    path = 'C:\\'
    event_handler = FileHandler()
    observer = Observer()
    observer.schedule(event_handler, path, recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()

    observer.join()

One thing I have noticed is that when a file is added, both on_created and on_modified are called! To work around this, I decided to use only the on_modified method. However, I am starting to notice that this also produces multiple callbacks, this time all to the on_modified method!

Modified: C:\images\C121211-0008.jpg
Modified: C:\images\C121211-0009.jpg
Modified: C:\images\C121211-0009.jpg <--- What?
Modified: C:\images\C121211-0010.jpg
Modified: C:\images\C121211-0011.jpg
Modified: C:\images\C121211-0012.jpg
Modified: C:\images\C121211-0013.jpg

I cannot figure out for the life of me why this is happening! It doesn't seem to be consistent either. If anyone could shed some light on this issue, it will be greatly appreciated.

There was a similar post, but it was for Linux: python watchdog modified and created duplicate events

Upvotes: 2

Views: 5478

Answers (2)

Alexander Pruss

Reputation: 434

A pretty robust workaround, less fragile than debouncing, that works for my use case on Windows is to check the modification time of the file. Something like this works for me:

import os

from watchdog.events import FileSystemEventHandler

class MyHandler(FileSystemEventHandler):
    def __init__(self):
        # last modification time seen for each path
        self.times = {}

    def on_modified(self, event):
        try:
            t = os.path.getmtime(event.src_path)
            if event.src_path in self.times and t == self.times[event.src_path]:
                # duplicate event
                return
            self.times[event.src_path] = t
        except FileNotFoundError:
            # file got deleted after event was triggered
            self.times.pop(event.src_path, None)
        # continue processing event

You can call on_modified() from on_created() if you want a single callback for both modify and create, and want the code to work on OSes (are there any?) where creating a file doesn't also trigger on_modified().
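As a rough sketch of that single-callback idea: the mixin below forwards on_created() to on_modified() and applies the same modification-time check. The names SingleCallbackMixin, Recorder, and handle_change() are made up for illustration, and the event is faked with a SimpleNamespace so the snippet runs without watchdog installed. In real use you would presumably list the mixin before FileSystemEventHandler in the class bases.

```python
import os
import tempfile
from types import SimpleNamespace


class SingleCallbackMixin:
    """Hypothetical mixin: funnels created and modified events into handle_change()."""

    def __init__(self):
        super().__init__()
        self.times = {}  # last modification time seen for each path

    def on_created(self, event):
        # Treat a creation like any other change.
        self.on_modified(event)

    def on_modified(self, event):
        try:
            t = os.path.getmtime(event.src_path)
        except FileNotFoundError:
            # File vanished after the event fired; forget it.
            self.times.pop(event.src_path, None)
            return
        if self.times.get(event.src_path) == t:
            return  # same mtime as last time: treat as a duplicate
        self.times[event.src_path] = t
        self.handle_change(event.src_path)


class Recorder(SingleCallbackMixin):
    """Stand-in handler that just remembers which paths it processed."""

    def __init__(self):
        super().__init__()
        self.handled = []

    def handle_change(self, path):
        self.handled.append(path)


# Simulate a "created" event followed by a "modified" event for the same write.
with tempfile.TemporaryDirectory() as d:
    photo = os.path.join(d, "photo.jpg")
    with open(photo, "wb") as f:
        f.write(b"x")
    fake_event = SimpleNamespace(src_path=photo, is_directory=False)
    rec = Recorder()
    rec.on_created(fake_event)
    rec.on_modified(fake_event)
    handled = list(rec.handled)
```

Here the second call is dropped because the file's modification time has not changed between the two events.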

Upvotes: 0

RichieHindle

Reputation: 281835

When a process writes a file, it first creates it, then writes the contents a piece at a time.

What you're seeing is a set of events corresponding to those actions. Sometimes the pieces are written quickly enough that Windows only sends a single event for all of them, and other times you get multiple events.
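To make that concrete, here is a small sketch of a writer that behaves this way. The function write_in_pieces() and its chunk size are invented for illustration; how a real camera or copy tool writes its files, and how many events Windows reports for it, will vary.

```python
import os
import tempfile


def write_in_pieces(path, data, chunk_size=4):
    """Create `path`, then write `data` in flushed chunks; return the number of writes."""
    writes = 0
    with open(path, "wb") as f:  # step 1: the file is created
        for start in range(0, len(data), chunk_size):
            f.write(data[start:start + chunk_size])  # step 2: one piece at a time
            f.flush()
            os.fsync(f.fileno())  # push this piece out to the OS
            writes += 1
    return writes


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "photo.jpg")
    n_writes = write_in_pieces(target, b"0123456789")
    size = os.path.getsize(target)
```

Each of those separate writes is a point at which the OS may report a change to the file, which is why a watcher can see more than one event per file.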

This is normal... depending on what the surrounding code needs to do, it might make sense to keep a set of modified pathnames rather than a list.
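A minimal sketch of the set idea, assuming the handler's on_modified() would call record(event.src_path) in place of paths.append(...). The ModifiedPaths class and its method names are hypothetical; the lock is there because watchdog calls handlers from its own thread while the main loop would be the one draining.

```python
import threading


class ModifiedPaths:
    """Collects each modified path once, however many events arrive for it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._paths = set()

    def record(self, src_path):
        # Adding a path that is already present is a no-op.
        with self._lock:
            self._paths.add(src_path)

    def drain(self):
        """Return the paths seen so far (sorted) and start a fresh set."""
        with self._lock:
            seen, self._paths = self._paths, set()
        return sorted(seen)


pending = ModifiedPaths()
for p in [
    r"C:\images\C121211-0008.jpg",
    r"C:\images\C121211-0009.jpg",
    r"C:\images\C121211-0009.jpg",  # the duplicate event from the question
]:
    pending.record(p)

batch = pending.drain()
```

The main loop could call drain() once per iteration and process each photo a single time, regardless of how many events arrived for it.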

Upvotes: 6
