tbhaxor
tbhaxor

Reputation: 1943

Unable to execute awk command in a function, but working directly in the shell

I want to create a utility function for bash to remove duplicate lines. I am using function

function remove_empty_lines() {
  if ! command -v awk &> /dev/null
  then
    echo '[x] ERR: "awk" command not found'
    return
  fi

  if [[ -z "$1" ]]
  then
    echo "usage: remove_empty_lines <file-name> [--replace]"
    echo
    echo "Arguments:"
    echo -e "\t--replace\t (Optional) If not passed, the result will be redirected to stdout"
    return
  fi

  if [[ ! -f "$1" ]]
  then
    echo "[x] ERR: \"$1\" file not found"
    return
  fi
  echo $0
  local CMD="awk '!seen[$0]++' $1"

  if [[ "$2" = '--reload' ]]
  then
    CMD+=" > $1"
  fi

  echo $CMD
}

If I am running the main awk command directly, it is working. But when i execute the same $CMD in the function, I am getting this error

$ remove_empty_lines app.js
/bin/bash
awk '!x[/bin/bash]++' app.js

Upvotes: 2

Views: 355

Answers (2)

Charles Duffy
Charles Duffy

Reputation: 295363

The original code is broken in several ways:

  • When used with --reload, it would truncate the output file's contents before awk could ever read those contents (see How can I use a file in a command and redirect output to the same file without truncating it?)
  • It didn't ever actually run the command, and for the reasons described in BashFAQ #50, storing a shell command in a string is inherently buggy (one can work around some of those issues with eval; BashFAQ #48 describes why doing so introduces security bugs).
  • It wrote error messages (and other "diagnostic content") to stdout instead of stderr; this means that if your function's output was redirected to a file, you could never see its errors -- they'd end up jumbled into the output.
  • Error cases were handled with a return even in cases where $? would be zero; this means that return itself would return a zero/successful/truthy status, not revealing to the caller that any error had taken place.

Presumably the reason you were storing your output in CMD was to be able to perform a redirection conditionally, but that can be done other ways: Below, we always create a file descriptor out_fd, but point it to either stdout (when called without --reload), or to a temporary file (if called with --reload); if-and-only-if awk succeeds, we then move the temporary file over the output file, thus replacing it as an atomic operation.

remove_empty_lines() {
  local out_fd rc=0 tempfile=
  command -v awk &>/dev/null || { echo '[x] ERR: "awk" command not found' >&2; return 1; }

  if [[ -z "$1" ]]; then
    printf '%b\n' >&2 \
      'usage: remove_empty_lines <file-name> [--replace]' \
      '' \
      'Arguments:' \
      '\t--replace\t(Optional) If not passed, the result will be redirected to stdout'
    return 1
  fi

  [[ -f "$1" ]] || { echo "[x] ERR: \"$1\" file not found" >&2; return 1; }

  if [ "$2" = --reload ]; then
    tempfile=$(mktemp -t "$1.XXXXXX") || return
    exec {out_fd}>"$tempfile" || { rc=$?; rm -f "$tempfile"; return "$rc"; }
  else
    exec {out_fd}>&1
  fi

  awk '!seen[$0]++' <"$1" >&$out_fd || { rc=$?; rm -f "$tempfile"; return "$rc"; }
  exec {out_fd}>&- # close our file descriptor

  if [[ $tempfile ]]; then
    mv -- "$tempfile" "$1" || return
  fi
}

Upvotes: 7

markp-fuso
markp-fuso

Reputation: 34184

First off the output from your function call is not an error but rather the output of two echo commands (echo $0 and echo $CMD).

And as Charles Duffy has pointed out ... at no point is the function actually running the $CMD.


As for the inclusion of /bin/bash in your function's echo output ... the main problem is the reference to $0; by definition $0 is the name of the running process, which in the case of a function is the shell under which the function is being called. Consider the following when run from a bash command prompt:

$ echo $0
-bash

As you can see from your output this generates /bin/bash in your environment. See this and this for more details.

On a related note, the reference to $0 within double quotes causes the $0 to be evaluated, so this:

local CMD="awk '!seen[$0]++' $1"

becomes

local CMD="awk '!seen[/bin/bash]++' app.js"

I'm thinking what you want is something like:

echo $1                               # the name of the file to be processed
local CMD="awk '!seen[\$0]++' $1"     # escape the '$' in '$0'

becomes

local CMD="awk '!seen[$0]++' app.js"

That should fix the issues shown in your function's output; as for the other issues ... you're getting a good bit of feedback in the various comments ...

Upvotes: 1

Related Questions