Haravikk

Reputation: 3280

Reading available characters only

Okay, so I have a script that's listening for some input from a command, and I'd like to be able to manipulate everything that's arrived so far (if anything), but I'm starting to wonder if this is even possible with read.

Here's an example of what I mean:

#!/bin/bash
long_running_task() {
    i=0
    while [ $i -lt 10 ]; do
        i=$(($i + 1))
        printf '%s' "Task $i "
        sleep 1
    done
}

read_output() {
    status=0
    while [ $status = 0 ]; do
        # read at most 16 characters; -d '' makes NUL the delimiter so newlines don't end the read
        IFS='' read -rd '' -n 16 text; status=$?
        [ -z "$text" ] && continue

        printf '%s %s' "(${#text})" "$text"
    done
    echo
}

long_running_task | read_output

As you'll see, the read_output function only prints text once 16 characters are available (or the end of the input is reached), rather than printing whatever has arrived so far.

Unfortunately I have no control over the output of the long running task, so I can't simply choose a different delimiter. Currently the only way that I can work around this (that I know of) is to have read fetch only a single character at a time, but this is horribly inefficient.

Is there any way to have read, or something else, fetch as much input as it can then return it, but still offer a means to detect whether the end of the input was reached (as opposed to there simply being no output to capture just yet)?

The environment I'm currently working in is bash, so I'll accept any bash-specific solutions, but if there are any portable options then I'd love to see those as well. It also doesn't matter how quickly the command returns, provided it does so in a reasonable amount of time (so I can query output every few seconds).

[edit] Since there's some confusion about the issue, I'll try to give a more functional example:

#!/bin/bash
long_running_task() {
    i=0
    while [ $i -lt 10 ]; do
        i=$(($i + 1))
        printf '%s' "Waiting $i second(s)… "
        sleep $i
        printf '%s\n' 'done.'
    done
}

NL=$'\n'
log() {
    status=0
    while [ $status = 0 ]; do
        IFS='' read -rd '' -n 16 text; status=$?
        [ -z "$text" ] && continue

        printf '%s' "${text/$NL/$NL[$(date +%R)] }"
    done
}

long_running_task | log

Still a bit simplistic, but as you can see it takes output from the long-running task, processes new-lines, and adds a time-stamp at the start of lines that need one; it's not perfect, but it hopefully gives the basic idea. This isn't exactly what I want to do, but I do need to process new-lines, and this seems to be the only way to do it; with read forcing me to wait, it's not responsive enough.

Upvotes: 0

Views: 82

Answers (2)

Haravikk

Reputation: 3280

Unfortunately I've had to settle for a compromise; although I wanted to avoid reading single characters, it seems to be the only way to stop read from blocking on data that has already arrived.

So my solution looks like the following:

log() {
    status=0; new_line=1
    while [ $status = 0 ]; do
        IFS='' read -rn 1 char; status=$?
        if [ -z "$char" ]; then
             [ $status = 0 ] && { printf '\n'; new_line=1; }
             continue
        fi

        [ $new_line = 1 ] && { printf '%s' "[$(date +%R)] "; new_line=0; }
        printf '%s' "$char"
    done
}

This is the pure shell version, and it should actually work in most shell environments (not just bash, unless I've missed something). For output to stdout it's fine, but if it's adapted to write to a file then it should either be piped through tee or the printf statements need to be redirected to a file descriptor; otherwise it is incredibly slow, since the file gets re-opened for every single character.
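
For instance, a rough sketch of the file-descriptor variant might look like the following (the log file argument and the choice of descriptor 3 are mine, purely for illustration, not part of the answer above):

log() {
    logfile=${1:-output.log}          # placeholder name, pass your own
    exec 3>>"$logfile"                # open the log once instead of once per character
    status=0; new_line=1
    while [ $status = 0 ]; do
        IFS='' read -rn 1 char; status=$?
        if [ -z "$char" ]; then
             [ $status = 0 ] && { printf '\n' >&3; new_line=1; }
             continue
        fi

        [ $new_line = 1 ] && { printf '%s' "[$(date +%R)] " >&3; new_line=0; }
        printf '%s' "$char" >&3
    done
    exec 3>&-                         # close the descriptor once input ends
}

Called as long_running_task | log mylog.txt, the file is opened a single time up front, so the per-character writes stay cheap.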

Of course other languages like python and perl may be better solutions overall, and I'll use them where available, but this will need to be my fallback. I think I'm also going to make its usage optional, so that it is only used for scripts that expect partial input from some of the programs they run.

Upvotes: 0

Alexandre Halm

Reputation: 989

@Haravikk: your question relates closely to mine here on SO

Unfortunately, many experiments and the only answer I got from @chepner led me to think that there is no easy way to make read do what you want it to do. It seems that read can only swallow chunks of text in three ways (sketched just after this list):

  • after it gets a newline (or another custom separator, at least in zsh)
  • after a predefined number of chars
  • after a predefined timeout
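
For reference, the three modes look roughly like this in bash (a minimal sketch; the delimiter, count and timeout values are arbitrary examples, not taken from the question):

# 1. Return once a custom delimiter is seen (here ':'):
IFS='' read -r -d ':' part
# 2. Return once a fixed number of characters has arrived:
IFS='' read -r -n 16 part
# 3. Return once a timeout expires; on timeout the exit status is greater
#    than 128, which lets you tell a timeout apart from end of input:
IFS='' read -r -t 5 part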

PS: it makes sense that read wouldn't take whatever has arrived so far: imagine that your read_output function does string substitution, e.g. sed "s/Hello/World/g", and that your task yields Hel and then, after a long time, lo. Reading the output in chunks would miss the substitution, which would probably defeat the purpose of read_output; catching the substitution would mean taking chunks as they come (based on some time-based or character-count-based algorithm) while also concatenating and re-parsing previous chunks, in which case you'd probably need a proper parser rather than plain read ...
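
A quick way to see this (a hypothetical demo of the scenario above, using a 3-character chunk for brevity):

# "Hello" straddles two chunks, so chunk-wise substitution misses it.
slow_hello() { printf 'Hel'; sleep 2; printf 'lo\n'; }

# Whole-line processing: sed sees "Hello" once the newline arrives and prints "World".
slow_hello | sed 's/Hello/World/g'

# Chunk-by-chunk (3 characters at a time): neither "Hel" nor "lo" matches,
# so "Hello" passes through unchanged.
slow_hello | while IFS='' read -rd '' -n 3 chunk || [ -n "$chunk" ]; do
    printf '%s' "${chunk/Hello/World}"
done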

Upvotes: 1
