Reputation: 1909
My Python program needs to multiplex reads from several different file descriptors. Some of them are the stdout/stderr descriptors of subprocesses; others are the file descriptors associated with inotify calls.
My problem is being able to do a "non-blocking"[1] read after select(). According to the documentation, sockets that select() reports to be ready for writes "are guaranteed to not block on a write of up to PIPE_BUF bytes".
I suppose that no such guarantee makes sense for a read: select() reporting that there is data waiting to be read in the kernel pipe buffer doesn't mean that you can go ahead and do .read(socket.PIPE_BUF), as there could be just a few bytes in there.
This means that when I'm calling read() on the socket, I can get what is effectively a deadlock, as some of the subprocesses produce output very rarely.
Is there any way around this? My current workaround is to call readline() on it, and I'm lucky enough that everything I'm reading from has line-by-line output. Is select() of any use at all when reading from a pipe like this, seeing as there's no way to know how many bytes you can safely read without blocking?
[1] I'm aware that this is distinct from an O_NONBLOCK socket
Upvotes: 0
Views: 1089
Reputation: 817
Just as an alternative, I ran into exactly the same problem and solved it by using readline(1) and appending that to an internal buffer until readline returned a character that I was interested in tokenizing on (newline, space, etc.).
More detail: I called select() on a file descriptor and then called readline(1) on any file descriptor that was returned by select, appended that char to a buffer, and repeated until readline returned what I wanted. Then I returned my buffer, cleared it and moved on. Incidentally, I also returned a Boolean that let the calling method know whether the data I was returning was empty because of a bad read or just because it wasn't done.
I also implemented a version that would tokenize on a timeout. If I'd been buffering for x ms without finding a newline or EOF, go ahead and return the buffer.
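The approach above could look roughly like the following. This is only a sketch, not my original code: the name read_token and its parameters are made up for illustration, and it uses os.read(fd, 1) on the raw descriptor instead of readline(1) so that no file-object buffering gets between select() and the read.

```python
import os
import select
import time


def read_token(fd, delimiters=b"\n", timeout=0.5):
    """Read one byte at a time from fd until a delimiter, EOF, or timeout.

    Returns (data, complete). complete is False when the timeout expired
    before a delimiter or EOF was seen, so the caller knows the token
    may be unfinished.
    """
    buf = bytearray()
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return bytes(buf), False
        # Wait until the descriptor is readable or the time budget runs out.
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            return bytes(buf), False
        chunk = os.read(fd, 1)
        if not chunk:
            # An empty read means EOF: the write end has been closed.
            return bytes(buf), True
        buf += chunk
        if chunk in delimiters:
            return bytes(buf), True
```

Reading a single byte per system call is slow, but it never asks for more than select() has promised, so it cannot block.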
I'm currently trying to find out if there's a way to ask a file descriptor how many bytes it has waiting to be read, then just readline([that many bytes])...
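One candidate for that is the FIONREAD ioctl. The sketch below is Unix-only and assumes the platform supports FIONREAD on the kind of descriptor in question (Linux does for pipes); bytes_available is a name made up for illustration.

```python
import array
import fcntl
import os
import termios


def bytes_available(fd):
    """Return the number of bytes that can be read from fd without blocking.

    Assumes a Unix platform where FIONREAD is supported for this
    descriptor type.
    """
    buf = array.array("i", [0])
    # ioctl writes the pending byte count into the mutable buffer.
    fcntl.ioctl(fd, termios.FIONREAD, buf)
    return buf[0]
```

A call such as os.read(fd, bytes_available(fd)) made after select() has reported the descriptor as readable would then ask only for data that is already there.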
Hope that helps.
Upvotes: 0
Reputation: 489253
It's OK to go ahead and read each pipe and socket: you'll get whatever data is available now, up to the size you ask for, and the call blocks only when nothing at all is available:
>>> import os
>>> desc = os.pipe()
>>> desc
(3, 4)
>>> os.write(desc[1], 'foo')
3
>>> os.read(desc[0], 100)
'foo'
>>> os.read(desc[0], 100)
[hangs here as there's no input available, interrupt with ^C]
...
KeyboardInterrupt
>>> os.write(desc[1], 'a')
1
>>> os.read(desc[0], 100)
'a'
>>>
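Combined with select(), the same behaviour gives a read loop that does not stall on a quiet descriptor. The following is a minimal sketch in Python 3 syntax (so the data is bytes); the function name drain and the 4096-byte chunk size are arbitrary choices for illustration.

```python
import os
import select


def drain(fds, chunk_size=4096, timeout=None):
    """Collect data from several readable descriptors until each hits EOF.

    Returns a dict mapping each descriptor to the bytes read from it.
    """
    open_fds = set(fds)
    output = {fd: b"" for fd in fds}
    while open_fds:
        ready, _, _ = select.select(list(open_fds), [], [], timeout)
        if not ready:
            break  # timed out with nothing readable
        for fd in ready:
            # select() said fd is readable, so this returns at once with
            # whatever is available (at most chunk_size bytes).
            data = os.read(fd, chunk_size)
            if data:
                output[fd] += data
            else:
                open_fds.discard(fd)  # empty read means EOF
    return output
```

The key point is to call os.read() on the descriptor itself rather than read() on a buffered file object, which may keep reading until it has the full amount requested.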
Upvotes: 3