jnome

Reputation: 41

ruby Process.spawn stdout => pipe buffer size limit

In Ruby, I'm using Process.spawn to run a command in a new process. I've opened a pipe to capture stdout and stderr from the spawned process. This works great until the bytes written to the pipe (stdout from the command) exceed 64 KB, at which point the command never finishes. I'm thinking the pipe buffer size has been hit and writes to the pipe are now blocked, causing the process to never finish. In my actual application, I'm running a long command that produces lots of stdout, which I need to capture and save when the process has finished. Is there a way to raise the buffer size, or better yet, have the buffer drained so the limit is not hit?

cmd = "for i in {1..6600}; do echo '123456789'; done"  #works fine at 6500 iterations.

pipe_cmd_in, pipe_cmd_out = IO.pipe
cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)

Process.wait(cmd_pid)  #hangs here once the command's output fills the 64 KB pipe buffer
pipe_cmd_out.close
out = pipe_cmd_in.read
puts "child: cmd out length = #{out.length}"

UPDATE Open3::capture2e does seem to work for the simple example I showed. Unfortunately, in my actual application I also need the pid of the spawned process, and I need control over when I block execution. The general idea is that I fork a non-blocking process. In this fork, I spawn a command. I send the command pid back to the parent process, then wait on the command to get its exit status. When the command finishes, the exit status is sent back to the parent. In the parent, a loop iterates every second, checking the DB for control actions such as pause and resume. If it gets a control action, it sends the appropriate signal (stop or continue) to the command pid. When the command eventually finishes, the parent hits the rescue block, reads the exit status pipe, and saves the status to the DB. Here's what my actual flow looks like:

#pipes for communicating the command pid, and exit status from child to parent
pipe_parent_in, pipe_child_out = IO.pipe
pipe_exitstatus_read, pipe_exitstatus_write = IO.pipe

child_pid = fork do
    pipe_cmd_in, pipe_cmd_out = IO.pipe
    cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)
    pipe_child_out.write cmd_pid  #send command pid to parent
    pipe_child_out.close
    Process.wait(cmd_pid)
    exitstatus = $?.exitstatus
    pipe_exitstatus_write.write exitstatus  #send exitstatus to parent
    pipe_exitstatus_write.close
    pipe_cmd_out.close
    out = pipe_cmd_in.read
    #save out to DB
end

#close the parent's copy of the write end, then do a blocking read to get the command pid from the child
pipe_child_out.close
cmd_pid = pipe_parent_in.read.to_i

loop do
    begin
        Process.getpgid(cmd_pid)  #when the command is done, this raises (Errno::ESRCH)
        @job.reload #refresh from DB

        #based on status in the DB, pause / resume command
        if @job.status == 'pausing'
            Process.kill('SIGSTOP', cmd_pid)
        elsif @job.status == 'resuming'
            Process.kill('SIGCONT', cmd_pid)
        end
    rescue
        #command is no longer running
        pipe_exitstatus_write.close
        exitstatus = pipe_exitstatus_read.read
        #save exit status to DB
        break
    end
    sleep 1
end

NOTE: I cannot have the parent poll the command output pipe because the parent would then be blocked waiting for the pipe to close. It would not be able to pause and resume the command via the control loop.

Upvotes: 4

Views: 4445

Answers (2)

user1118597

Reputation: 133

This code seems to do what you want, and may be illustrative.

cmd = "for i in {1..6600}; do echo '123456789'; done"

pipe_cmd_in, pipe_cmd_out = IO.pipe
cmd_pid = Process.spawn(cmd, :out => pipe_cmd_out, :err => pipe_cmd_out)

@exitstatus = :not_done
Thread.new do
  Process.wait(cmd_pid)
  @exitstatus = $?.exitstatus
end

pipe_cmd_out.close
out = pipe_cmd_in.read
sleep(0.1) while @exitstatus == :not_done
puts "child: cmd out length = #{out.length}; Exit status: #{@exitstatus}"

In general, sharing data between threads (@exitstatus) requires more care, but it works here because it's only written once, by the thread, after initialization. (It turns out $?.exitstatus can return nil, which is why I initialized it to something else.) The call to sleep() is unlikely to execute even once since the read() just above it won't complete until the spawned process has closed its stdout.
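If you'd rather not rely on the thread-local $? global, the same idea can be sketched with Process.wait2, which returns the status directly. This assumes the same cmd_pid, pipe_cmd_in, and pipe_cmd_out as the snippet above:

@status = nil
waiter = Thread.new do
  _pid, status = Process.wait2(cmd_pid)  #blocks only this thread
  @status = status                       #a Process::Status object
end

pipe_cmd_out.close
out = pipe_cmd_in.read   #draining the pipe keeps the command from blocking on a full buffer
waiter.join              #the read returns once the command exits and closes its stdout
puts "child: cmd out length = #{out.length}; Exit status: #{@status.exitstatus}"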

Upvotes: 1

dbenhur

Reputation: 20408

Indeed, your diagnosis is likely correct. You could implement a select and read loop on the pipe while waiting for the process to end, but likely you can get what you want more simply with stdlib Open3::capture2e.
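For reference, a minimal sketch of the capture2e approach, using the command string from the question. capture2e merges stdout and stderr and drains the pipe as the command runs, so the 64 KB buffer is never hit:

require 'open3'

cmd = "for i in {1..6600}; do echo '123456789'; done"

out, status = Open3.capture2e(cmd)
puts "out length = #{out.length}; exit status = #{status.exitstatus}"

If you also need the pid of the spawned process (as in the update), Open3.popen2e exposes it through the wait thread it yields:

require 'open3'

Open3.popen2e(cmd) do |stdin, stdout_and_stderr, wait_thr|
  puts "spawned pid: #{wait_thr.pid}"
  out = stdout_and_stderr.read      #read while the command runs to avoid the buffer limit
  status = wait_thr.value           #Process::Status, available after the command exits
  puts "out length = #{out.length}; exit status = #{status.exitstatus}"
end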

Upvotes: 0
