ikegami

Reputation: 385575

Generating sh code from within sh: Escaping

I have a shell variable (which we shall call x) containing a string with shell metacharacters. For example, I might have

abc  "def's"  ghi

as set by

x='abc  "def'\''s"  ghi'

I want to build a shell command from that string (to be stored in a file, not executed). What are my options?

echo "prog $x"     >>file     # Doesn't work
echo "prog '$x'"   >>file     # Doesn't work
echo "prog \"$x\"" >>file     # Doesn't work
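To see why each attempt fails, it can help to let sh parse the generated line. The following is only a sketch: `show` is a hypothetical stand-in for prog that prints every argument it receives in brackets, and `sh -c` stands in for running the generated file.

```shell
x='abc  "def'\''s"  ghi'

# Hypothetical stand-in for "prog": prints each received argument in brackets.
show='printf "[%s]\n"'

# 1. Unquoted: the line parses, but as three separate words, and the
#    double quotes and doubled spaces of the original string are lost.
sh -c "$show $x"

# 2. and 3. The quote characters inside $x close the surrounding quotes
#    early and leave an unterminated quote, so sh rejects the whole line.
sh -c "$show '$x'"   2>/dev/null || echo "single-quoted attempt: syntax error"
sh -c "$show \"$x\"" 2>/dev/null || echo "double-quoted attempt: syntax error"
```

In every case the string is re-parsed as shell syntax, so each metacharacter in it has to be neutralized before the line is written.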

The current solution uses sed:

y=`echo "$x" | sed 's/\([^a-zA-Z0-9._\-\/]\)/\\\\\1/g'`
echo "prog $y" >>file

The output is as follows (although equivalent output is also acceptable):

prog abc\ \ \"def\'s\"\ \ ghi

The problem is that the number of places this needs to be done is increasing. Does anyone have a better solution?


Upvotes: 2

Views: 1328

Answers (2)

ikegami

Reputation: 385575

sh has functions.

# to_shell_lit() - Creates a shell literal
# Usage: shell_lit="$( to_shell_lit "..." )"
to_shell_lit() {
   printf \'
   printf %s "$1" | sed "s/'/'\\\\''/g"
   printf \'
}

Testing:

$ x='abc  "def'\''s"  ghi'

$ printf '%s\n' "$x"
abc  "def's"  ghi

$ printf '%s\n' "prog `to_shell_lit "$x"`"
prog 'abc  "def'\''s"  ghi'

$ printf '%s\n' "prog $( to_shell_lit "`pwd`" )"
prog '/home/ikegami/foo bar'

$ printf '%s\n' "$( to_shell_lit "a'b" )"
'a'\''b'

$ printf '%s\n' "$( to_shell_lit '-n' )"
'-n'

$ printf '%s\n' "$( to_shell_lit '\\' )"
'\\'

$ printf '%s\n' "$( to_shell_lit 'foo
bar' )"
'foo
bar'

A version that takes multiple arguments:

# to_shell_lit() - Creates a string of space-separated shell literals
# Usage: shell_lits="$( to_shell_lit "..." "..." "..." )"
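to_shell_lit() {
   # "local" is not in POSIX sh, but dash, bash, and busybox sh support it.
   local prefix=''
   local p
   for p in "$@"; do
      # Pass $prefix as data, not as the printf format string.
      printf %s "$prefix'"
      printf %s "$p" | sed "s/'/'\\\\''/g"
      printf \'
      prefix=' '
   done
}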
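Since the goal in the question is to append a command to a file, here is a minimal round-trip sketch using the single-argument function. The temporary file from mktemp and the use of printf in place of prog are assumptions made only so that the generated line can be run safely.

```shell
# Single-argument to_shell_lit, as defined earlier in this answer.
to_shell_lit() {
   printf \'
   printf %s "$1" | sed "s/'/'\\\\''/g"
   printf \'
}

x='abc  "def'\''s"  ghi'
file=$(mktemp)    # hypothetical output file for the generated command

# Generate the command line. printf stands in for "prog".
printf '%s\n' "printf '[%s]\n' $( to_shell_lit "$x" )" >>"$file"

# Show the generated line, then run it. The original string should come
# back as a single argument.
cat "$file"
roundtrip=$(sh "$file")
printf '%s\n' "$roundtrip"
rm -f "$file"
```

Comparing `$roundtrip` with `[$x]` is a quick way to confirm that nothing was lost in the quoting.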

Upvotes: 5

mtraceur

Reputation: 3726

The following solution works on ALL input strings that the Bourne shell can handle (including newline characters), uses no external commands on most systems, and is portable to all modern Bourne-like shells. Each argument passed to this esceval function is printed individually, properly escaped and quoted. (I named it esceval for "escape evaluation", in case you were wondering.)

esceval()
{
    case $# in 0) return 0; esac
    while :
    do
        printf "'"
        unescaped=$1
        while :
        do
            case $unescaped in
            *\'*)
                printf %s "${unescaped%%\'*}""'\''"
                unescaped=${unescaped#*\'}
            ;;
            *)
                printf %s "$unescaped"
                break
            ;;
            esac
        done
        shift
        case $# in 0) break; esac
        printf "' "
    done
    printf "'\n"
}

More pedantic elaboration on my guarantees above:

  1. The above code is portable to all shells that have functions (all shells in practical use today) and the ${foo#bar} and ${foo%%bar} substitutions (all shells you need to care about unless you're targeting Solaris 10's /bin/sh or similar ancient relics).
  2. The above code does not need to fork/exec any new processes unless printf is not a builtin in your shell. It is fairly uncommon for printf to be available only as an external command, whereas sed is almost always an external command. I have also seen stripped-down systems that have a printf but no sed, if that matters.

Note: The version posted here leaks one variable into the global namespace (unescaped), but you can easily fix this by either declaring local unescaped if your shell supports that, or by wrapping the body of the function in a subshell (parentheses - they can even replace the curly braces, though that's a bit visually non-obvious if you go that route, and most shells do fork an additional process for a subshell).
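As a sketch of the subshell option, the function below is the same logic with the curly braces replaced by parentheses. The two sample arguments and the eval round trip at the end are illustrative assumptions, not part of the original function.

```shell
# Same logic as the first esceval, but the body is a subshell ( ... ),
# so the assignment to "unescaped" never reaches the caller.
esceval()
(
    case $# in 0) return 0; esac
    while :
    do
        printf "'"
        unescaped=$1
        while :
        do
            case $unescaped in
            *\'*)
                printf %s "${unescaped%%\'*}""'\''"
                unescaped=${unescaped#*\'}
            ;;
            *)
                printf %s "$unescaped"
                break
            ;;
            esac
        done
        shift
        case $# in 0) break; esac
        printf "' "
    done
    printf "'\n"
)

unset unescaped
quoted=$(esceval 'abc  "def'\''s"  ghi' "it's")
printf '%s\n' "$quoted"

# The variable assigned inside the function is not visible here.
printf 'unescaped is %s\n' "${unescaped-unset}"

# eval turns the quoted text back into the original two arguments.
eval "set -- $quoted"
printf '[%s]\n' "$@"
```

The cost is the extra process that most shells fork for the subshell, which is the trade-off mentioned in the note above.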

On the other hand, if by some chance you do need to support systems which don't have those variable substring substitutions, you can use sed, but you need to be careful to properly escape tricky things like newlines in your strings:

esceval()
{
    case $# in 0) return 0; esac
    while :
    do
        printf "'"
        printf %s "$1" | sed "s/'/'\\\\''/g"
        shift
        case $# in 0) break; esac
        printf "' "
    done
    printf "'\n"
}

Final note on trailing newlines: Bourne shells strip trailing newlines from command substitutions (most strip all trailing newlines, a few shells strip just one). In other words, be wary of this:

# Literal strings work fine:
esceval 'foo



'

# Quoted variable substitution also works fine:
ln='
'
esceval "foo$ln$ln$ln$ln"

# Breaks - newlines never make it into `esceval`:
esceval "`printf 'foo\n\n\n\n'`"
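A common workaround, shown here only as a sketch, is to print a sentinel character after the real output so that the newlines are no longer trailing, and then strip the sentinel. The sentinel "x" and the variable names are arbitrary choices, and the ${kept%x} expansion needs the same substitution support as the first version of esceval.

```shell
# Naive capture: the trailing newlines are stripped by the command
# substitution, so only "foo" is left.
naive=$(printf 'foo\n\n\n\n')

# Workaround: append a sentinel so the newlines are not trailing,
# then remove the sentinel from the captured value.
kept=$(printf 'foo\n\n\n\n'; printf x)
kept=${kept%x}

printf 'naive length: %s\n' "${#naive}"
printf 'kept length:  %s\n' "${#kept}"
```

Passing "$kept" to esceval then preserves all four newlines, because a quoted variable expansion does not strip anything.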

P.S. There's also this monstrosity I made: a polyfill that selects between the previous two versions depending on whether your shell seems capable of supporting the necessary variable substitution syntax. It looks awful, though, because the shell-only version has to be inside an eval-ed string to keep incompatible shells from barfing when they see it: https://github.com/mentalisttraceur/esceval/blob/master/sh/esceval.sh

Upvotes: 2
