Subhashree Behera

Reputation: 23

Why do double quotes not work in an echo statement inside a cmd string in an awk script?

gawk 'BEGIN { FS="|"; OFS="|" }NR ==1 {print} NR >=2 {cmd1="echo -n "$2" | base64 -w 0";cmd1 | getline d1;close(cmd1); print $1,d1 }' dummy2.txt

input:

id|dummy                                                                           
1|subhashree:1;user=phn                                                             
2|subha:2;user=phn        

                                                                                                            

Expected output:

id|dummy                                       
1|c3ViaGFzaHJlZToxO3VzZXI9cGhuCg==                               
2|c3ViaGE6Mjt1c2VyPXBobgo= 

                                           

output produced by script:

id|dummy                                       
1|subhashree:1                                                                                                                                                                                                                                                                                                                                                   
2|subha:2     

                                                                                                   

I understand that the double quotes around $2 are causing the issue: they have no effect, so the string is not encoded properly and everything after the semicolon is simply stripped off. The same command does work when the string is inside double quotes, and gives the proper output in a terminal:

echo "subhashree:1;user=phn" | base64                                               
c3ViaGFzaHJlZToxO3VzZXI9cGhuCg==                                                         
[root@DERATVIV04 encode]# echo "subha:2;user=phn" | base64                               
c3ViaGE6Mjt1c2VyPXBobgo=        

                                                      

I have tried different variations of single and double quotes inside awk, but none of them work. Any help will be highly appreciated.

Thanks a lot in advance.

Upvotes: 1

Views: 571

Answers (3)

Ed Morton

Reputation: 204498

You already got answers explaining how to use awk for this, but you should also consider not using awk for this. The tool to sequence calls to other commands (e.g. base64) is a shell, not awk. What you're trying to do in terms of calls is:

shell { awk { loop_on_input { shell { base64 } } } }

whereas if you call base64 directly from shell it'd just be:

shell { loop_on_input { base64 } }

Note that the awk command is spawning a new subshell once per line of input while the direct call from shell isn't.

For example:

#!/usr/bin/env bash

file='dummy2.txt'
head -n 1 "$file"
while IFS='|' read -r id dummy; do
    printf '%s|%s\n' "$id" "$(base64 -w 0 <<<"$dummy")"
done < <(tail -n +2 "$file")
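
One detail worth checking when comparing outputs: a bash here-string (<<<) appends a newline to the data, just as printf '%s\n' does, so the encoded value differs from what echo -n or printf '%s' yields. A minimal sketch using one of the sample values from the question (the variable names are only illustrative, and plain base64 is used on the assumption that such short strings do not wrap):

```shell
s='subha:2;user=phn'

# Newline included, as with <<<"$s" or plain echo
with_nl=$(printf '%s\n' "$s" | base64)

# No trailing newline, as with echo -n
without_nl=$(printf '%s' "$s" | base64)

printf 'with newline:    %s\nwithout newline: %s\n' "$with_nl" "$without_nl"
```

The first value matches the expected output in the question, and the second matches what the echo -n based awk answers print.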

Here's the difference in execution speed for an input file that has each of your data lines duplicated 100 times created by awk -v n=100 'NR==1{print; next} {for (i=1;i<=n;i++) print}' dummy2.txt > file100

$ ./tst.sh file100
Awk:

real    0m23.247s
user    0m3.755s
sys     0m10.966s

Shell:

real    0m14.512s
user    0m1.530s
sys     0m4.776s

The above timing was produced by running this script (both awk scripts posted in answers will have about the same timing, so I just picked one at random):

#!/usr/bin/env bash

doawk() {
    local file="$1"
    gawk -v q="'" 'BEGIN {
                 FS=OFS="|"
               }
               NR==1{
                  print;
                  next
               }
               {
                 cmd1="echo -n " q $2 q" | base64 -w 0";
                 print ((cmd1 | getline d1)>0)? $1 OFS d1 : $0;
                 close(cmd1);
               }
               ' "$file"
}

doshell() {
    local file="$1"
    head -n 1 "$file"
    while IFS='|' read -r id dummy; do
        printf '%s|%s\n' "$id" "$(base64 -w 0 <<<"$dummy")"
    done < <(tail -n +2 "$file")
}

# Use 3rd-run timing to eliminate caching as a factor

doawk "$1" >/dev/null
doawk "$1" >/dev/null
echo "Awk:"
time doawk "$1" >/dev/null

echo ""

doshell "$1" >/dev/null
doshell "$1" >/dev/null
echo "Shell:"
time doshell "$1" >/dev/null

Upvotes: 2

Akshay Hegde

Reputation: 16997

Your existing cmd1 produces this command:

echo -n subhashree:1;user=phn | base64 -w 0

                    ^ semicolon is there

So executing it produces:

$ echo -n subhashree:1;user=phn | base64 -w 0
subhashree:1

With quotes

$ echo -n 'subhashree:1;user=phn' | base64 -w 0
c3ViaGFzaHJlZToxO3VzZXI9cGhu

The solution is simply to put quotes around the string: echo -n '<your-string>' | base64 -w 0

$ cat file 
id|dummy
1|subhashree:1;user=phn
2|subha:2;user=phn

$ gawk -v q="'" 'BEGIN { FS="|"; OFS="|" }NR ==1 {print} NR >=2 {cmd1="echo -n " q $2 q" | base64 -w 0";  cmd1 | getline d1;close(cmd1); print $1,d1 }' file
id|dummy
1|c3ViaGFzaHJlZToxO3VzZXI9cGhu
2|c3ViaGE6Mjt1c2VyPXBobg==

It can be simplified as below

gawk -v q="'" 'BEGIN {
                 FS=OFS="|"
               }
               NR==1{
                  print;
                  next
               }
               {
                 cmd1="echo -n " q $2 q" | base64 -w 0";
                 print ((cmd1 | getline d1)>0)? $1 OFS d1 : $0;
                 close(cmd1);
               }
               ' file

The getline test above follows Ed Morton's recommendation; see http://awk.freeshell.org/AllAboutGetline

if/while ( (getline var < file) > 0)
if/while ( (command | getline var) > 0)
if/while ( (command |& getline var) > 0)
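
Wrapping the field in single quotes still breaks if the data itself contains a single quote. A possible hardening, sketched here under the assumption that the input may contain such characters, is to escape each embedded ' as '\'' before quoting. shq() is a hypothetical helper name, not part of awk, the sample row is made up for illustration, and printf %s stands in for echo -n:

```shell
# Sample input containing an embedded single quote (made-up data)
input=$(printf '%s\n' 'id|dummy' "1|it's;user=phn")

out=$(printf '%s\n' "$input" | awk -v q="'" '
# shq(): hypothetical helper that single-quotes s for the shell,
# rewriting each embedded quote as quote-backslash-quote-quote
function shq(s) { gsub(q, q "\\" q q, s); return q s q }
BEGIN { FS = OFS = "|" }
NR == 1 { print; next }
{
    cmd = "printf %s " shq($2) " | base64"
    if ((cmd | getline d) > 0) $2 = d
    close(cmd)
    print
}')

printf '%s\n' "$out"
```

Decoding the second field of the data row with base64 -d should give back the original value, quote included.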

Upvotes: 2

Inian

Reputation: 85865

The problem is a lack of quotes when the echo command runs in the shell context. The command string you build is effectively converted into

echo -n subhashree:1;user=phn | base64 -w 0

which the shell executes as two commands separated by ;. The second, user=phn | base64 -w 0, is an assignment piped into base64; the assignment writes nothing to standard output, so base64 has nothing to encode and produces empty output. The first, echo -n subhashree:1, just echoes its argument, and that is what ends up in your getline variable d1.
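
The split can be reproduced directly from a shell prompt. A small sketch, with printf '%s' standing in for echo -n and illustrative variable names, runs the unquoted and the quoted form of the generated command through sh -c, which is roughly how awk runs a command string:

```shell
s='subhashree:1;user=phn'

# Unquoted: sh parses "printf '%s' subhashree:1" and "user=phn | base64"
unquoted=$(sh -c "printf '%s' $s | base64")

# Quoted: the whole value reaches printf as a single argument
quoted=$(sh -c "printf '%s' '$s' | base64")

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"
```

Only the quoted form hands the full string to base64; the unquoted form returns the text before the semicolon unencoded.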

The right approach to fixing your problem is to use quotes:

echo -n "subhashree:1;user=phn" | base64 -w 0

You said you were putting quotes around $2, but that is not actually what happens: those quotes are consumed by awk, where they delimit the string literals being concatenated into cmd, i.e. "echo -n ", $2 and " | base64 -w 0" are just joined together. The double quotes need to appear in the command that the shell sees.
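
One way to see this is to have awk print the command string instead of running it. A quick sketch using a sample row from the question (the cmd shell variable is only illustrative):

```shell
cmd=$(printf '%s\n' '1|subhashree:1;user=phn' |
    awk -F'|' '{ print "echo -n "$2" | base64 -w 0" }')

# The double quotes were consumed by awk; none reach the shell
printf '%s\n' "$cmd"
```

The printed string contains no quotes at all, which is why the shell is free to split it at the semicolon.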

So with that and a few other fixes, your awk command should be as below. I added gsub() to remove the trailing spaces that were present in the input you showed, and used printf instead of echo.

awk -v FS="|" '
    BEGIN {
        OFS = FS
    }
    
    NR == 1 {
        print
    }
    
    NR >= 2 {
        gsub(/[[:space:]]+$/, "", $2)
        cmd = "printf \"%s\" \"" $2 "\" | base64 -w 0"
        if ((cmd | getline result) > 0) {
            $2 = result
        }
        close(cmd)
        print
    }    
' file

With the command above, the shell executes the following, which produces the right result.

printf "%s" "subhashree:1;user=phn" | base64 -w 0

Upvotes: 2
