Reputation: 16523
I have a Python 2.7.x process running in an infinite loop that monitors a folder on an Ubuntu server.
Whenever it finds a file, it checks the file against a set of known files that have been processed already, and acts accordingly. In pseudocode:
found = set()
while True:
    for file in all_files("<DIR>"):
        if file not in found:
            process_file(file, found)
How can I make sure that a file hasn't just begun being copied there? I don't want to, say, take the MD5 sum of the file or open it with another process until I'm sure it's all there and ready.
Upvotes: 0
Views: 141
Reputation: 34426
The safest solution is to use the Linux kernel's inotify API via the pyinotify library. Experiment with the IN_CREATE and IN_MOVED_TO events depending on your needs. Also note this blog post warning of some implementation problems with the pyinotify library.
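A minimal sketch of how this could look, assuming pyinotify is installed. The names `handle_ready` and `watch` are hypothetical, and `process_file` stands in for the function from the question. One assumption beyond the events named above: the sketch listens for IN_CLOSE_WRITE rather than IN_CREATE, because IN_CREATE fires as soon as the file appears, before its contents are written, whereas IN_CLOSE_WRITE fires once the writer has closed it.

```python
def handle_ready(path, found, process_file):
    """Process `path` once; return True if it was newly processed."""
    if path in found:
        return False
    process_file(path, found)
    found.add(path)
    return True


def watch(directory, found, process_file):
    """Block forever, handing finished files in `directory` to process_file."""
    # Third-party package, imported lazily so the helper above stays usable
    # on machines where pyinotify is not installed.
    import pyinotify

    class Handler(pyinotify.ProcessEvent):
        # Fired when a file that was open for writing is closed,
        # for example when an in-place copy has finished.
        def process_IN_CLOSE_WRITE(self, event):
            handle_ready(event.pathname, found, process_file)

        # Fired when a file is renamed or moved into the watched directory.
        def process_IN_MOVED_TO(self, event):
            handle_ready(event.pathname, found, process_file)

    wm = pyinotify.WatchManager()
    wm.add_watch(directory, pyinotify.IN_CLOSE_WRITE | pyinotify.IN_MOVED_TO)
    pyinotify.Notifier(wm, Handler()).loop()
```

Calling `watch("<DIR>", set(), process_file)` would replace the polling loop from the question. Files that were already in the directory before the watch started produce no events, so they would still need a one-time scan.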
Upvotes: 2
Reputation: 7930
Due to locks and other system-level operations, you will not be able to do anything to the file until it has completed copying.
A file cannot be in two operations at once.
Upvotes: 2