Reputation: 517
I'm looking at the following OCaml code, which attempts to open a subprocess, feed it some input, and then collect all of the output it produces on either stdout or stderr. But the fact that it first reads everything from stdout and then reads everything from stderr seems fishy -- if the subprocess happens to write a lot of stuff to stderr, it seems like the result will be a deadlock.
let rec output_lines (output : string list) (chan : out_channel) : unit =
ignore (List.map (output_string chan) output); flush chan
let async_command
(name : string)
(arguments : string list)
: (in_channel * out_channel * in_channel) =
Unix.open_process_full
(name ^ " " ^ (String.concat " " arguments))
(Unix.environment ())
let sync_command
(name : string)
(arguments : string list)
(input : string list) : (string list * string list) =
let (o, i, e) = async_command name arguments in
output_lines input i;
let out, err = (input_lines o, input_lines e) in
let status = Unix.close_process_full (o, i, e) in
begin match status with
| Unix.WEXITED x -> if x != 0 then raise (Shell_error (unlines err))
| Unix.WSIGNALED x -> if x = Sys.sigint then raise Sys.Break
| Unix.WSTOPPED x -> if x = Sys.sigint then raise Sys.Break
end;
(out, err)
How should this be fixed? (Better yet, what library should I use where this functionality is already implemented?)
Upvotes: 1
Views: 1554
Reputation: 35210
In general, you should use some sort of polling over a set of descriptors along with the non-blocking input to pipe information from the to the output. This will, of course, garbage the output, i.e., intermix it in an arbitrary order. The polling can be performed with the Unix.select
function. A non-blocking io can be enabled by setting the O_NONBLOCK
flag when a file is opened, or with the Unix.set_nonblock
function.
With all these said, I would like to stress that writing non-blocking code is not a trivial exercise. Especially if it is done in raw, poll/read/write loop. Many modern languages/runtimes have libraries that decouple this cycle and provide a generic interface in the form of callbacks. OCaml was one of the pioneers in this direction, with its Lwt library. We also have the Async library, that is slightly bigger but shares the same design. I wouldn't advise one or another, as it is opinion based, but, as an anecdote, your namesake Benjamin Pierce was using Lwt for his Unison project :)
We can burrow an example from the Unix System Programming book
(* The multiplex function takes a descriptor open on the serial
port and two arrays of descriptors of the same size, one containing
pipes connected to the standard input of the user processes, the other
containing pipes connected to their standard output. *)
open Unix;;
let rec really_read fd buff start length =
if length <= 0 then () else
match read fd buff start length with
| 0 -> raise End_of_file
| n -> really_read fd buff (start+n) (length-n);;
let buffer = String.create 258;;
let multiplex channel inputs outputs =
let input_fds = channel :: Array.to_list inputs in
try
while true do
let (ready_fds, _, _) = select input_fds [] [] (-1.0) in
for i = 0 to Array.length inputs - 1 do
if List.mem inputs.(i) ready_fds then begin
let n = read inputs.(i) buffer 2 255 in
buffer.[0] <- char_of_int i;
buffer.[1] <- char_of_int n;
ignore (write channel buffer 0 (n+2));
()
end
done;
if List.mem channel ready_fds then begin
really_read channel buffer 0 2;
let i = int_of_char(buffer.[0])
and n = int_of_char(buffer.[1]) in
if n = 0 then close outputs.(i) else
begin
really_read channel buffer 0 n;
ignore (write outputs.(i) buffer 0 n);
()
end
end
done
with End_of_file -> () ;;
Note: You should use create_subprocess
that returns the file descriptors, to implement non blocking IO.
Upvotes: 2