Lucas Cimon

Reputation: 2043

How to store the output of command in a variable without creating a subshell [Bash <v4]

ksh has a really interesting construct to do this, detailed in this answer: https://stackoverflow.com/a/11172617/636849

Since Bash 4.0, there is a mapfile builtin command that should solve this problem: http://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html

But strangely, it doesn't seem to work with process substitution:

foo () { echo ${BASH_SUBSHELL}; }
mapfile -t foo_output <(foo) # FAIL: hangs forever here (note: the intended redirection is '< <(foo)')
subshell_depth=${foo_output[0]} # should be 0

But how can this be done in Bash v3.2?

Upvotes: 24

Views: 10560

Answers (6)

Jeremiah Rose

Reputation: 4122

You could temporarily redirect stdout to a file, then run your command:

e.g.:

function capture_command_output_without_subshell() {
  local tmpfile=$(mktemp)  # Make a temp file
  exec 3>&1                # Save the current stdout
  exec 1> "$tmpfile"       # Redirect stdout to the temporary file
  eval "$*"                # Run the command in the current shell
  exec 1>&3 3>&-           # Restore stdout and close file descriptor 3
  OUTPUT=$(cat "$tmpfile") # Read the temp file into a variable
  rm "$tmpfile"            # Delete the temp file
}
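A usage sketch of this approach (the function is repeated so the snippet runs standalone; `depth_check` is an illustrative command showing that it really runs in the current shell):

```shell
#!/usr/bin/env bash
# Usage sketch for the function above (repeated here so the example is
# self-contained); depth_check is an illustrative command.

capture_command_output_without_subshell() {
  local tmpfile
  tmpfile=$(mktemp)        # Make a temp file
  exec 3>&1                # Save the current stdout
  exec 1> "$tmpfile"       # Redirect stdout to the temporary file
  eval "$*"                # Run the command in the current shell
  exec 1>&3 3>&-           # Restore stdout and close file descriptor 3
  OUTPUT=$(cat "$tmpfile") # Read the temp file into a variable
  rm "$tmpfile"            # Delete the temp file
}

depth_check() { echo "$BASH_SUBSHELL"; }

capture_command_output_without_subshell depth_check
echo "subshell depth: $OUTPUT"   # -> subshell depth: 0
```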

If you also want realtime output to stdout, you could tee the output to the temp file like this:

function capture_command_output_without_subshell() {
  local tmpfile=$(mktemp)   # Make a temp file
  exec 3>&1                 # Save the current stdout
  exec 1> >(tee "$tmpfile") # Copy stdout to the temporary file
  eval "$*"                 # Run the command in the current shell
  exec 1>&3                 # Restore stdout
  OUTPUT=$(cat "$tmpfile")  # Read the temp file into a variable
  rm "$tmpfile"             # Delete the temp file
}

Upvotes: 0

fozzybear

Reputation: 147

This function assigns to a given array var the (multi-line) result of the specified command (with args), via an anonymous COPROC, which should avoid creating a sub-shell (albeit creating a co-process, as the name implies) - or so I thought, until reading this.

Nevertheless, the following examples may still be of use, as an alternative means of setting vars:

coproc_set_array() {
    local -n var="$1" 
    coproc "${@:2}"
    exec {COPROC[1]}>&-
    readarray -t var <&"${COPROC[0]}"
    wait $COPROC_PID
    #return $COPROC_PID # -> could fetch and wait for PID after/outside of function instead
}

Invocation works as follows:

declare -a result
...
coproc_set_array result echo -e "TEST\nTEST2"
printf '%s\n' "${result[@]}"

The function accepts an array var name reference as 1st argument and uses the rest of args as the function and its arguments to be executed.

From the latter, it creates a co-process and executes it, passing the command's output to the fd connected to the bi-directional pipe of COPROC's stdin, and closing that fd in the same step (exec {COPROC[1]}>&-).

After that, the fd connected to COPROC's stdout is re-directed to readarray via <&"${COPROC[0]}", which creates an array from the command's result and assigns it to the corresponding named reference arg, finally ensuring the command's completion with wait $COPROC_PID.

When the COPROC invocation has returned, the result array var should contain the command's (multi-line) result.

For single-line result var assignments, the line

    readarray -t var <&"${COPROC[0]}"

in the above code example could be replaced with

    read -r var <&"${COPROC[0]}"

A dynamic approach to invoke named COPROCs might work like this:

coproc_named_set_array() {
    local -n _coprocName="$1" _arrayRef="$2" 
    printf -v _c0 '%q%s' "${!_coprocName}" '[0]'
    printf -v _c1 '%q%s' "${!_coprocName}" '[1]'
    printf -v _arrayVar '%q' "${!_arrayRef}"
    coproc "${!_coprocName}" { "${@:3}"; }
    eval 'exec {'"$_c1}"'>&- && readarray -t "$_arrayVar" <&${'"$_c0"'}'
    local -i _coprocPID=${!_coprocName}_PID
    wait $_coprocPID
}

Invoked e.g. like this (the command might also be a function):

coproc_named_set_array echoCoproc result echo -e "TEST\nTEST2"

printf -v _<VAR> '%q' ... is used to harden eval (which is required for dynamic evaluation of the fd connector vars to the co-process' bi-directional pipe) against injection attacks. It's not my area of expertise, though, and might still have issues, so use with caution.

If that isn't a concern, normal local var assignments can be used, of course.

Yet another approach: this function accepts a 'producer' and a 'consumer' argument, each either as a plain string, named var, array or function reference, which are then expanded and executed accordingly.

Compound commands only work if referenced from functions; otherwise, only simple commands are possible:

coproc_to_cmd() {
    [[ $# -ne 2 ]] && echo -e "FATAL: coproc_to_cmd() missing or illegal args!" && exit 1
    local tmp exitStatus
    for arg; do 
        tmp="$(declare -p $arg 2>/dev/null)"
        if [[ $? -ne 0 || $tmp == "declare --"* ]]; then
            [[ $_producer ]] && local _consumer="$arg" || local _producer="$arg"
        else
            [[ $_producer ]] && local -n _consumer="$arg" || local -n _producer="$arg"
        fi      
    done
    coproc ${_producer[@]} && ${_consumer[@]}<&"$COPROC"
    exitStatus=$?
    #exec {COPROC[1]}>&- # should be unneccessary when using wait
    wait $COPROC_PID
    return $exitStatus
}

Note:

It might be possible to return the COPROC's PID, fetch it after the invocation, and wait for the command's completion at some point further down in the code, allowing concurrent processing in the background - provided no other anonymous COPROC using the same PID is invoked in between.

For that to work, at least one named COPROC would be required; but usually only one running co-process is allowed anyway, so wait $<COPROC_NAME>_PID should always be invoked before creating the next co-process.

Whether this approach yields any significant advantage over a direct array assignment per readarray (with input via process substitution or from a pipe), or over a var assignment via sub-shell, depends on the requirements and needs some testing.

Using COPROC should give better access control over the co-process' input/output (compared to just detaching it with &) and may at least perform better, when run concurrently with the parent process, for expensive operations.

see: COPROC command examples

Upvotes: 1

Digital Trauma

Reputation: 15986

Here's another way to do it, which is different enough that it warrants a separate answer. I think this method is subshell-free and bash sub-process free:

ubuntu@ubuntu:~$ bar () { echo "$BASH_SUBSHELL $BASHPID"; }
ubuntu@ubuntu:~$ bar
0 8215
ubuntu@ubuntu:~$ mkfifo /tmp/myfifo
ubuntu@ubuntu:~$ exec 3<> /tmp/myfifo
ubuntu@ubuntu:~$ unlink /tmp/myfifo
ubuntu@ubuntu:~$ bar 1>&3
ubuntu@ubuntu:~$ read -u3 a
ubuntu@ubuntu:~$ echo $a
0 8215
ubuntu@ubuntu:~$ exec 3>&-
ubuntu@ubuntu:~$

The trick here is to use exec to open the FIFO in read-write mode with an FD, which seems to have the side-effect of making the FIFO non-blocking. Then you can redirect your command to the FD without it blocking, then read the FD.

Note that the FIFO is backed by a limited-size kernel buffer (64 KiB on modern Linux; PIPE_BUF, the atomic-write limit, is typically 4 KiB), so if your command produces more output than that, it will end up blocking again.
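Collected into a script, the same trick looks like this (a sketch; the FIFO path via mktemp -u is illustrative, and $BASHPID needs bash >= 4 - in 3.2 it simply expands to nothing):

```shell
#!/usr/bin/env bash
# Script form of the session above. The mktemp -u path is illustrative;
# $BASHPID needs bash >= 4.

bar() { echo "$BASH_SUBSHELL $BASHPID"; }

fifo=$(mktemp -u)      # a unique, not-yet-existing path for the FIFO
mkfifo "$fifo"
exec 3<> "$fifo"       # open read-write so that opening does not block
unlink "$fifo"         # the fd keeps the FIFO alive; the name can go

bar 1>&3               # run in the current shell, output into the pipe
read -u3 a             # read it back from the same fd
exec 3>&-              # close the fd

echo "$a"              # depth 0, plus the top-level shell's PID
```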

Upvotes: 19

yatsek

Reputation: 1005

This question comes up very often when looking for how to capture the output of any "printing" command into a variable. So for anyone looking, it's possible (since Bash v3.1.0) with:

printf -v VARIABLE_NAME "whatever you need here: %s" "$ID"

If you tweak your scripts for speed, you can use the pattern of setting a global variable at the end of a function instead of echoing the result - use this with care; it is sometimes criticized as leading to hard-to-maintain code.
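A minimal sketch of that pattern (function and variable names are illustrative):

```shell
#!/usr/bin/env bash
# Sketch of the "set a global instead of echoing" pattern;
# function and variable names are illustrative.

make_greeting() {
  # Instead of: greeting=$(make_greeting "world")  # forks a subshell
  printf -v MAKE_GREETING_RESULT 'Hello, %s!' "$1"
}

make_greeting "world"
echo "$MAKE_GREETING_RESULT"   # -> Hello, world!
```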

Upvotes: 6

kenorb

Reputation: 166349

The easiest way is to drop the function and pass the variable directly, e.g.:

declare -a foo_output
mapfile -t foo_output <<<${BASH_SUBSHELL}
subshell_depth=${foo_output[0]} # Should be zero.

Otherwise given two items in the function:

foo () { echo "$BASH_SUBSHELL $BASHPID"; }

you can use read (modify IFS as needed) like one of the following commands:

read -r subshell_depth pid < <(foo) # Two separate variables.
read -r -a foo_arr < <(foo) # One array.

or using readarray/mapfile (Bash >= 4.0):

mapfile -t foo_output < <(foo)
readarray -t foo_output < <(foo)

then split the first line of the output into an array:

foo_arr=(${foo_output[0]})
subshell_depth=${foo_arr[0]} # should be 0

Upvotes: 2

Digital Trauma

Reputation: 15986

Here's what I could come up with - it's a bit messy, but foo is run in the top-level shell context and its output is provided in the variable a in the top-level shell context:

#!/bin/bash

foo () { echo ${BASH_SUBSHELL}; }

mkfifo /tmp/fifo{1,2}
{
    # block, then read everything in fifo1 into the buffer array
    i=0
    while IFS='' read -r ln; do
        buf[$((i++))]="$ln"
    done < /tmp/fifo1
    # then write everything in the buffer array to fifo2
    for i in ${!buf[@]}; do
        printf "%s\n" "${buf[$i]}"
    done > /tmp/fifo2
} &

foo > /tmp/fifo1
read a < /tmp/fifo2
echo $a

rm /tmp/fifo{1,2}

This of course assumes two things:

  • fifos are allowed
  • The command group that is doing the buffering is allowed to be put into the background

I tested this to work in these versions:

  • 3.00.15(1)-release (x86_64-redhat-linux-gnu)
  • 3.2.48(1)-release (x86_64-apple-darwin12)
  • 4.2.25(1)-release (x86_64-pc-linux-gnu)

Addendum

I'm not sure the mapfile approach in bash 4.x does what you want, as the process substitution <() creates a whole new bash process (though not a bash subshell within that bash process):

$ bar () { echo "$BASH_SUBSHELL $BASHPID"; }
$ bar
0 2636
$ mapfile -t bar_output < <(bar)
$ echo ${bar_output[0]}
0 60780
$ 

So while $BASH_SUBSHELL is 0 here, it is because it is at the top level of the new shell process 60780 in the process substitution.

Upvotes: 4
