ZzZRazerZzZ

Reputation: 3

Base64 utility in awk

I ran into an issue when trying to convert an image binary (with a 0xFFD8 JPEG signature) to a base64 string using awk. I seem to be almost there, but the base64 string comes out truncated. The image binary is large, and I am not sure whether that is the cause. The command I am running is below:

#!/bin/bash
awk --field-separator '|' '{ "echo "$mybinaryhere" | xxd -r -p | base64" | getline x; print x }' myfile.csv

The output is:

/9j/2wBDAAMCAgMCAgMDAwMEAwMEBQgFBQQEBQoHBwYIDAoMDAsKCwsNDhIQDQ4RDgsLEBYQERMU

The expected output should start the same way but be much longer, because the input is a full binary image. $mybinaryhere is just the column that holds the full image while awk is reading myfile.csv.

Thanks

Upvotes: 0

Views: 1619

Answers (2)

tshiono

Reputation: 22062

The output of base64 is wrapped at 76 columns, and each line ends with a newline. A single call to awk's getline reads only the next line of the command's output, so the remaining lines are never read.
Try looping over getline until the output is exhausted:

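awk --field-separator '|' '{ cmd = "echo "$mybinaryhere" | xxd -r -p | base64"; while ((cmd | getline line) > 0) print line; close(cmd) }' myfile.csv

Building the command once and reading into a variable keeps $0 intact, so the command string does not change between iterations. The "> 0" test stops the loop on errors as well as at end of output, and close() releases the pipe before the next row.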
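
To see the difference in isolation, here is a minimal sketch that uses `seq 3` as a stand-in for the multi-line base64 output. The function names are made up for illustration.

```shell
# A single getline reads only the first record of the command's output.
first_line_only() {
  awk 'BEGIN {
    cmd = "seq 3"
    cmd | getline x
    close(cmd)
    print x
  }'
}

# Looping while getline returns a positive value reads every record.
# Here the records are joined without separators, much like unwrapping
# base64 output into one long string.
all_lines() {
  awk 'BEGIN {
    cmd = "seq 3"
    while ((cmd | getline x) > 0) out = out x
    close(cmd)
    print out
  }'
}

first_line_only   # prints 1
all_lines         # prints 123
```

The same loop applies to the base64 pipeline: each pass yields one 76-column line, which can be printed as-is or appended to a variable.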

Upvotes: 1

jhnc

Reputation: 16819

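gawk (and other versions of awk) limit the length of the command string that can be piped into getline. The limit most likely comes from the operating system rather than from awk itself: awk hands the command to sh -c as a single argument, and Linux caps a single argument at 131072 bytes including the terminating NUL (MAX_ARG_STRLEN), which matches the numbers below.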

For example, on an Ubuntu box, I get:

bash$ for a in gawk 'busybox awk'; do
  for x in 131059 131060; do
      echo "$a :"
      perl -e 'print "." x '$x',"\n"' |\
      $a '{
        y=$1
        print length($1)+length("echo |md5sum")
        "echo "y"|md5sum" | getline z
        print z
      }'
  done
done
gawk :
131071
3ee42da12241d3e96a1513588bf50daf  -
gawk :
131072

busybox awk :
131071
3ee42da12241d3e96a1513588bf50daf  -
busybox awk :
131072

$

On FreeBSD, I get:

  • awk: 261764
  • nawk: 261763
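
One way around the limit, sketched here as an assumption rather than something tested against the original data, is to keep the command string short and send the field through the command's standard input with awk's `print | cmd` form. The helper names below are hypothetical, and `wc -c` stands in for the real pipeline so the result is easy to check.

```shell
# Hypothetical helper: emits one record made of N dots (N is $1).
make_record() {
  head -c "$1" /dev/zero | tr '\0' '.'
  echo
}

# Sends each record to the command's standard input instead of
# embedding it in the command string, so the string stays short.
count_bytes_via_stdin() {
  awk '{
    cmd = "wc -c"            # stand-in for "xxd -r -p | base64"
    printf "%s", $0 | cmd    # the data travels through the pipe
    close(cmd)               # flush and wait before the next record
  }'
}

# 200000 characters is well above the 131071 that worked on Ubuntu above.
make_record 200000 | count_bytes_via_stdin
```

For the question's data, the command would presumably be "xxd -r -p | base64" with the hex column in place of $0. The output then goes straight to awk's standard output, so no getline loop is needed.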

Upvotes: 0
