Alexander Tsepkov

Reputation: 4186

awk complains about non-terminated string in command from concatenated strings

Background: I'm calling the date command from inside awk. Its flags differ between Linux (GNU) and macOS, so I store the correct command and flags in a $date shell variable to work around this. The following awk command, which relies on string concatenation, fails:

awk '{
    cmd = "'$date'" substr( $1, 1, length($1) - 3 ) " +\"%Y-%m-%d %H:%M\""
    if ( (cmd | getline dd) > 0 ) {
        $1 = dd
    }
    close(cmd)
    print
}'

with error:

awk: non-terminated string date... at source line 2
 context is
         >>>  <<<
awk: giving up
 source line number 3

When I replace awk with echo, the program text prints correctly:

{
    cmd = "date -r " substr( $1, 1, length($1) - 3 ) " +\"%Y-%m-%d %H:%M\""
    if ( (cmd | getline dd) > 0 ) {
        $1 = dd
    }
    close(cmd)
    print
}

When I paste the above script into awk directly, it also parses dates correctly: it takes the first field of each line of stdin as a timestamp, strips the last three (sub-second) digits, and converts the date to a human-readable format.

The $date variable is populated as follows:

date="date -d @"
date -d @1550000000 &>/dev/null
if [ $? -eq 1 ]; then
    date="date -r "
fi

Upvotes: 2

Views: 880

Answers (2)

Steven Lu

Reputation: 43457

The other answer, which helped me unravel the mystery, changes how awk is invoked and so sidesteps the shell-quoting question of embedding a variable in the awk program this way.

I think I solved your shell script syntax problem. The setup:

args.sh:

#!/bin/bash

# copypasta code that shoves $1, $2... into 0-indexed bash array and prints it out.
# store arguments in a special array
args=("$@")
# get number of elements
ELEMENTS=${#args[@]}

# echo each element in array
# for loop
for (( i=0;i<$ELEMENTS;i++)); do
    echo ARGS[${i}]: ${args[${i}]}
done

test.sh:

date="date -r "
./args.sh '{
    cmd = "'$date'" substr( $1, 1, length($1) - 3 ) " +\"%Y-%m-%d %H:%M\""
    if ( (cmd | getline dd) > 0 ) {
        $1 = dd
    }
    close(cmd)
    print
}'

Execution:

❯ ./args.sh one two three
ARGS[0]: one
ARGS[1]: two
ARGS[2]: three

❯ bash test.sh
ARGS[0]: { cmd = "date
ARGS[1]: -r
ARGS[2]: " substr( $1, 1, length($1) - 3 ) " +\"%Y-%m-%d %H:%M\"" if ( (cmd | getline dd) > 0 ) { $1 = dd } close(cmd) print }

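Explanation: your shell variable is expanded without double quotes, so the spaces inside it are subject to word splitting. awk therefore receives three arguments instead of the expected one, and the first of them is a malformed, incomplete awk program. The echo test does not reveal this root problem, because echo joins its arguments with spaces and prints the same text either way.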
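
To make the masking effect concrete, here is a minimal sketch that compares echo with printf. The shortened program text and the variable names echo_view and printf_view are made up for illustration; only the quoting pattern comes from the question.

```shell
#!/bin/bash
date="date -r "

# echo joins its arguments with single spaces, so three arguments
# print exactly like one argument would.
echo_view=$(echo 'cmd = "'$date'" rest')

# printf reuses its format once per argument, which exposes the boundaries.
printf_view=$(printf '[%s]\n' 'cmd = "'$date'" rest')

echo "$echo_view"
printf '%s\n' "$printf_view"
```

The echo line looks like one intact string, while printf prints three bracketed pieces, one per argument that the shell actually passed.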

Here is my fix: I added double quotes around the variable. The shell command looks pretty gnarly now, with a great deal of quoting involved.

❯ cat test.sh
date="date -r "
./args.sh '{
    cmd = "'"$date"'" substr( $1, 1, length($1) - 3 ) " +\"%Y-%m-%d %H:%M\""
    if ( (cmd | getline dd) > 0 ) {
        $1 = dd
    }
    close(cmd)
    print
}'
❯ bash test.sh
ARGS[0]: { cmd = "date -r " substr( $1, 1, length($1) - 3 ) " +\"%Y-%m-%d %H:%M\"" if ( (cmd | getline dd) > 0 ) { $1 = dd } close(cmd) print }
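
If you don't want a separate helper script, the same check can be sketched inline by counting arguments. count_args is a hypothetical helper and the awk program is shortened for illustration; this only demonstrates the shell quoting, not the awk logic.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

date="date -r "

# Unquoted expansion: the spaces in $date split the program into several words.
unquoted=$(count_args '{ cmd = "'$date'" substr($1, 1, 3) }')

# Double-quoted expansion: the program stays a single word.
quoted=$(count_args '{ cmd = "'"$date"'" substr($1, 1, 3) }')

echo "unquoted: $unquoted, quoted: $quoted"
```

This prints a count of 3 for the unquoted form and 1 for the quoted form, matching the ARGS listings above.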

I will not comment on the awk usage because I don't know how to use awk.

This type of code will be rather brittle, but hey, at least we don't have large backslash stacks in it yet. Has anyone written a quine lately?

Upvotes: 1

anubhava

Reputation: 785406

You should always use -v name=value syntax to pass shell variables to awk.

So in your case:

dt="date -r "

awk -v dt="$dt" '{
   cmd = dt substr( $1, 1, length($1) - 3 ) " +\"%Y-%m-%d %H:%M\""
   if ( (cmd | getline dd) > 0 ) {
       $1 = dd
   }
   close(cmd)
   print
}'

More on: How do I use shell variables in awk scripts?
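
For an end-to-end picture, here is a rough sketch that combines the platform detection from the question with -v. The convert function name, the sample input line, and the TZ=UTC setting are assumptions added for illustration, so that the result does not depend on the local time zone.

```shell
#!/bin/bash
export TZ=UTC   # assumption: pin the time zone so the output is reproducible

# Pick the flag that this platform's date accepts for epoch seconds.
if date -d @1550000000 >/dev/null 2>&1; then
    dt="date -d @"
else
    dt="date -r "
fi

# convert is a hypothetical wrapper around the awk program from this answer.
convert() {
    awk -v dt="$dt" '{
        cmd = dt substr($1, 1, length($1) - 3) " +\"%Y-%m-%d %H:%M\""
        if ((cmd | getline dd) > 0) {
            $1 = dd
        }
        close(cmd)
        print
    }'
}

result=$(echo "1550000000123 event" | convert)
echo "$result"   # 2019-02-12 19:33 event
```

Because the variable arrives through -v, the awk program stays inside one pair of single quotes and no nested shell quoting is needed.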

Also note the helpful comment by Ed below: awk string indices start at 1, not at 0 as in languages such as C/C++.
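
A small sketch of the 1-based indexing, using made-up sample strings and the hypothetical variable names head and pos:

```shell
#!/bin/bash
# substr() and index() both count characters starting from 1.
head=$(awk 'BEGIN { print substr("1550000000123", 1, 10) }')
pos=$(awk 'BEGIN { print index("date -r", "d") }')
echo "$head $pos"   # 1550000000 1
```

This is why the script above calls substr with a start of 1 to keep the leading digits of the timestamp.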

Upvotes: 1
