Reputation: 1661
I am using the Python watchdog module on a Windows 2012 server to monitor new files appearing on a shared drive. When watchdog notices the new file it kicks off a database restore process.
However, it seems that watchdog will attempt to restore the file the second it is created and not wait till the file has finished copying to the shared drive. So I changed the event to on_modified but there are two on_modified events, one when the file is initially being copied and one when it is finished being copied.
How can I handle the two on_modified events to only fire when the file being copied to the shared drive has finished?
What happens when multiple files are copied to the shared drive at the same time?
Here is my code
import time
import subprocess
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
class NewFile(FileSystemEventHandler):
def process(self, event):
if event.is_directory:
return
if event.event_type == 'modified':
if getext(event.src_path) == 'gz':
load_pgdump(event.src_path)
def on_modified(self, event):
self.process(event)
def getext(filename):
"Get the file extension"
file_ext = filename.split(".",1)[1]
return file_ext
def load_pgdump(src_path):
restore = 'pg_restore command ' + src_path
subprocess.call(restore, shell=True)
def main():
event_handler = NewFile()
observer = Observer()
observer.schedule(event_handler, path='Y:\\', recursive=True)
observer.start()
try:
while True:
time.sleep(1)
except KeyboardInterrupt:
observer.stop()
observer.join()
if __name__ == '__main__':
main()
Upvotes: 14
Views: 17748
Reputation: 47
Following up to ravenwing's answer, more details can be found about on_closed
in watchdog
here.
As mentioned in the documented issue, there is no documentation available for on_closed
yet and it can only be used with unix.
Upvotes: 1
Reputation: 837
On linux you also get close event. Than solution would be to wait with processing file until file gets closed. My approach would be to add on_closed handling.
class Handler(FileSystemEventHandler):
def __init__(self):
self.files_to_process = set()
def dispatch(self, event):
_method_map = {
'created': self.on_created,
'closed': self.on_closed
}
def on_created(self, event):
self.files_to_process.add(event.src_path)
def on_closed(self, event):
self.files_to_process.remove(event.src_path)
actual_processing(event.src_path)
Upvotes: 3
Reputation: 108
Old I know, but I recently came up with a solution for this exact problem. In my case, I was only concerned with wav and mp3 files. This function will ensure that only files that are completely copied will be sent to makerCore() because the created placeholder files do not have any extension and will always end up in 'not ready'. Once the file is completed it will trigger the watchdog module again except this time with an extension. This will work on multiple files simultaneously as well.
def on_created(event):
#print(event)
if str(event.src_path).endswith('.mp3') or str(event.src_path).endswith('.wav'):
makerCore(event)
else:
print('not ready')
Upvotes: 1
Reputation: 1
I've tried the check filesize - wait - check again routine many have suggested above but it's not very reliable. To make it work better I've added a check if the file is still locked.
file_done = False
file_size = -1
while file_size != os.path.getsize(file_path):
file_size = os.path.getsize(file_path)
time.sleep(1)
while not file_done:
try:
os.rename(file_path, file_path)
file_done = True
except:
return True
Upvotes: 0
Reputation: 161
This works for me. Tested in windows as well with python3.7
while True:
size_now = os.path.getsize(event.src_path)
if size_now == size_past:
log.debug("file has copied completely now size: %s", size_now)
break
# TODO: why sleep is not working here ?
else:
size_past = os.path.getsize(event.src_path)
log.debug("file copying size: %s", size_past)
Upvotes: 1
Reputation: 2682
I am using a different approach that might not be the most elegant one but is easy to do on any plateform if you have control on the side copying the file.
Just had 'in-progress' to the name of the file until the copying is complete, and then rename the file. You can then have a while loop waiting for the file with the name without 'in-progress' to exist and you're good.
Upvotes: 0
Reputation: 1390
I'm using following code to wait until file copied (for Windows only):
from ctypes import windll
import time
def is_file_copy_finished(file_path):
finished = False
GENERIC_WRITE = 1 << 30
FILE_SHARE_READ = 0x00000001
OPEN_EXISTING = 3
FILE_ATTRIBUTE_NORMAL = 0x80
if isinstance(file_path, str):
file_path_unicode = file_path.decode('utf-8')
else:
file_path_unicode = file_path
h_file = windll.Kernel32.CreateFileW(file_path_unicode, GENERIC_WRITE, FILE_SHARE_READ, None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None)
if h_file != -1:
windll.Kernel32.CloseHandle(h_file)
finished = True
print 'is_file_copy_finished: ' + str(finished)
return finished
def wait_for_file_copy_finish(file_path):
while not is_file_copy_finished(file_path):
time.sleep(0.2)
wait_for_file_copy_finish(r'C:\testfile.txt')
The idea is to try open a file for write with share read mode. It will fail if someone else is writing to it.
Enjoy ;)
Upvotes: 4
Reputation: 1622
In your on_modified event, just wait until the file is finished being copied, via watching the filesize.
Offering a Simpler Loop:
historicalSize = -1
while (historicalSize != os.path.getsize(filename)):
historicalSize = os.path.getsize(filename)
time.sleep(1)
print "file copy has now finished"
Upvotes: 12
Reputation: 2034
I had a similar issue recently with watchdog. A rather simple but not very smart workaround was for me to check the change of file size in a while loop using a two-element list, one for 'past', one for 'now'. Once the the values are equal the copying is finished.
Edit: something like this.
past = 0
now = 1
value = [past, now]
while True:
# change
# test
if value[0] == value[1]:
break
else:
value = [value[1], value[0]]
Upvotes: 1
Reputation: 744
I would add a comment as this isn't an answer to your question but a different approach... but I don't have enough rep yet. You could try monitoring filesize, if it stops changing you can assume copy has finished:
copying = True
size2 = -1
while copying:
size = os.path.getsize('name of file being copied')
if size == size2:
break
else:
size2 = os.path.getsize('name of file being copied')
time.sleep(2)
Upvotes: 3