Reputation: 11883
Typically, transcoding one of my hour-long audio recording sessions to an MP3 file takes twenty-odd minutes.
I want a Python script to run some further Python code once the OS X application GarageBand finishes writing that MP3 file.
What are the best ways in Python to detect that an external application has finished writing data to a file and has closed it? I read about kqueue and epoll, but since I have no background in OS event detection and couldn't find a good example, I am asking for one here.
The code I am using right now does the following, and I am looking for something more elegant.
while True:
    try:
        today_file = open("todays_recording.mp3", "r")
        my_custom_function_to_process_file(today_file)
    except IOError:
        print("File not ready yet..continuing to wait")
Upvotes: 3
Views: 934
Reputation: 4232
The answer is a bit nuanced.
Technically, you can wait for data to be written to a file like this:
from select import kqueue, kevent, KQ_FILTER_VNODE, KQ_NOTE_WRITE
fh = open('test.txt', 'r')
kqh = kqueue()
# Register a vnode watch for writes to fh, then wait up to 10 seconds for at most 1 event.
res = kqh.control([kevent(fh, KQ_FILTER_VNODE, fflags=KQ_NOTE_WRITE)], 1, 10)
print(res)
After a write is made to test.txt, res will be returned containing one or more events. If 10 seconds go by without a write being detected, res will be an empty list. If the watch queue overflows, which is extremely unlikely in this case as you're not watching for huge numbers of events on e.g. a directory of thousands of files, an overflow event will be returned in res.
But is that a good idea?
Watching for writes to determine when an external program is "done" with a file is generally a poor idea. Just because a write occurs doesn't mean that the program (GarageBand in this case) isn't going to make any more writes. Worse, from the perspective of that other program, its function call to write data may have already succeeded, but underlying buffering (at the language platform or OS level) may cause the notification to be delivered later or not at all in some rare cases.
If all you watch for is writes, then you end up having to poll the file to see if it's completely written (reinventing the code you already have) or use fallible and over-specific heuristics to guess when the writer is done (e.g. "writes were observed, then no more writes for 1 second; that probably means the writer is finished ... or else it means the computer's under heavy load and GarageBand is taking a long time to encode the next chunk of MP3 data before writing it").
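For illustration only, the "no more writes for a while" heuristic described above might look roughly like the sketch below. The function name wait_until_quiet and its parameters are made up for this example, and it polls os.stat rather than using kqueue so that it runs anywhere. It has exactly the weakness already mentioned: a writer that is merely slow looks the same as one that has finished.

```python
import os
import time


def wait_until_quiet(path, quiet_seconds=1.0, poll_interval=0.25, timeout=None):
    """Return True once `path` has kept the same size and mtime for
    `quiet_seconds`; return False if `timeout` seconds elapse first.

    This is only a heuristic: a slow writer looks exactly like a finished one.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    last_seen = None               # (size, mtime_ns) from the previous poll
    quiet_since = time.monotonic()

    while True:
        try:
            st = os.stat(path)
            current = (st.st_size, st.st_mtime_ns)
        except FileNotFoundError:
            current = None         # file not created yet: keep waiting

        now = time.monotonic()
        if current is None or current != last_seen:
            # Missing file or fresh activity: restart the quiet timer.
            last_seen = current
            quiet_since = now
        elif now - quiet_since >= quiet_seconds:
            return True

        if deadline is not None and now >= deadline:
            return False
        time.sleep(poll_interval)
```

For a 20-minute encode you would probably want a quiet period much longer than one second, e.g. wait_until_quiet("todays_recording.mp3", quiet_seconds=30), and even then it remains a guess.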
That brings us to the second part of your question: can you watch for external programs to close a file? If we can do that, we can get a much more reliable hint that the external program is done working with a given file.
The answer is yes ... but not easily in Python, and not at all on macOS.
kqueue(2) supports the NOTE_CLOSE and NOTE_CLOSE_WRITE fflags, which fire when a reader or writer handle to a file is closed. However, the Python stdlib doesn't supply those flags in the select module in the latest version as of the time of this writing (3.12).
Fortunately, this is an old BSD API and unlikely to change, so grabbing the raw values of those flags from the kernel source of a BSD (I found them in the NetBSD source) is easy:
#define NOTE_CLOSE 0x0100U /* file closed (no FWRITE) */
#define NOTE_CLOSE_WRITE 0x0200U /* file closed (FWRITE) */
Those values are 256 and 512 as unsigned (or signed) 16- or 32-bit integers, so we should be able to wait for them manually, like this:
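from select import kqueue, kevent, KQ_FILTER_VNODE
# Raw flag values copied from the NetBSD headers; the select module does not export them.
KQ_NOTE_CLOSE = 256
KQ_NOTE_CLOSE_WRITE = 512
fh = open('test.txt', 'r')
kqh = kqueue()
# Wait up to 10 seconds for at most 1 close event on fh.
res = kqh.control([kevent(fh, KQ_FILTER_VNODE, fflags=KQ_NOTE_CLOSE_WRITE | KQ_NOTE_CLOSE)], 1, 10)
print(res)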
However, that doesn't work (it doesn't wake up when other programs close the file), because NOTE_CLOSE and NOTE_CLOSE_WRITE are not available on macOS. Unfortunately, it doesn't seem like the macOS-native FSEvents file monitoring API publishes events relating to file closure either.
The verdict is that this is not possible on macOS, but is likely possible (with a bit of unauthorized mucking about with select.kqueue internal flags) on other BSDs.
Upvotes: 0
Reputation: 678
You could popen lsof and filter by either the process or file you're interested in...
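A minimal sketch of that idea, assuming lsof is on the PATH and relying on its documented behaviour of exiting with a non-zero status when it finds nothing to list. The helper names and the lsof_cmd parameter are hypothetical, added only so the command can be swapped out. Note that this is still polling, and that lsof also exits non-zero on errors such as a missing file, so check that the file exists first.

```python
import os
import subprocess
import time


def is_open_by_any_process(path, lsof_cmd=("lsof",)):
    """Return True if lsof reports at least one process holding `path` open.

    lsof exits with status 0 when it lists something and non-zero when it
    finds nothing (or fails, e.g. because the file does not exist).
    """
    result = subprocess.run(
        [*lsof_cmd, os.path.abspath(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_until_closed(path, poll_interval=5.0, timeout=None, lsof_cmd=("lsof",)):
    """Poll until no process has `path` open; False if `timeout` expires."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while is_open_by_any_process(path, lsof_cmd):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

Something like wait_until_closed("todays_recording.mp3") could then replace the try/except loop in the question. Don't hold the file open in your own script while calling it, since your own handle would be reported too.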
Upvotes: 1