Reputation: 1402

Remove last argument in shell script (POSIX)

I am currently working on a language that aims to compile to POSIX shell languages and I want to introduce a pop feature. Just like how you can use "shift" to remove the first argument passed to a function:

f() {
  shift
  printf '%s' "$*"
}

f 1 2 3 #=> 2 3

I want some code that when introduced below can remove the last argument.

g() {
  # pop
  printf '%s' "$*"
}

g 1 2 3 #=> 1 2

I am aware of the array method detailed in (Remove last argument from argument list of shell script (bash)), but I want something portable that works in at least the following shells: ash, dash, ksh (Unix), bash, and zsh. I also want something reasonably fast; anything that spawns external processes or subshells would be too heavy for small argument counts, though if you have a creative solution I wouldn't mind seeing it regardless (it could still serve as a fallback for large argument counts). Something as fast as the array methods would be ideal.

Upvotes: 3

Answers (4)

Rui Damas

Reputation: 11

& pure dash compatible... :)

Usage:

test () { echo "$@" ; } ;

with_init test 1 2 3 4 ;  # test will be called with: 1 2 3

Lib:

#!/bin/sh

init_arguments_ () {  #L variable [#arguments] last ;

  local ia_i=0 ia_v="$1" ia_tr= ;
      # index  variable  to_return

  shift ;

  while [ $(( ia_i += 1 )) -lt $# ] ; do

    # Use ${N} rather than $N: "$10" would be read as "$1" followed by "0"
    ia_tr=$ia_tr" \"\${$ia_i}\"" ;  done ;

  eval "$ia_v=\$ia_tr" ;
} ;

unshift () {  #L [#arguments] ;

  local args= ;  init_arguments_ args "$@" ;

  echo "eval set $args" ;
} ;


unshift_ () {  #L arguments_variable

  eval "$1=\${$1"'% \"*}' ;
} ;

with_init () {
#L command [#arguments] last

  local command=$1 args= ;  shift ;

  # Maybe get last before removal

    # eval 'local last="$'$#\" ;

  # Shorter

    # $(unshift "$@") ;

    # "$command" "$@" ;

  # Faster

  init_arguments_ args "$@" ;

  ## Maybe unshift another, nice for loops

    # unshift_ args ;

  eval "$command$args" ;

    # Or:  eval "set $args" ;
    #      "$command" "$@" ;

} ;

Upvotes: 1

phranz

Reputation: 69

alias pop='set -- $(eval printf '\''%s\\n'\'' $(seq $(expr $# - 1) | sed '\''s/^/\$/;H;$!d;x;s/\n/ /g'\'') )'

EDIT:

This is a POSIX shell solution that uses an alias instead of a function. When called inside a function, it gives the desired effect: it resets the function's arguments to the same list minus the last one. Because it is an alias expanded in place, and uses eval, it can change the positional parameters of the enclosing function:

func () {
    pop
    pop
    echo "$@"
}
func a b c d e      # prints a b c
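
The reason an alias (or an inline eval) is needed, rather than a plain function, can be shown with a small sketch: set and shift inside a function only change that function's own positional parameters. The names pop_in_function, caller_a and caller_b are made up for this illustration.

```shell
# Hypothetical helper: shift here only affects this function's own "$@".
pop_in_function() { shift; }

caller_a() {
    pop_in_function "$@"      # the caller's arguments are left untouched
    printf '%s\n' "$#"
}

caller_b() {
    eval 'shift'              # evaluated in caller_b's own context
    printf '%s\n' "$#"
}

caller_a x y z   # prints 3
caller_b x y z   # prints 2
```

This is why the other answers either expand code in place (alias) or build a string for the caller to eval.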

Upvotes: 2

Richard Tingstad

Reputation: 425

pop () {
    i=0
    while [ $((i+=1)) -lt $# ]; do
        set -- "$@" "$1"
        shift
    done # 1 2 3 -> 3 1 2
    printf '%s' "$1" # last argument
    shift # $@ is now without last argument
}
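
As a usage sketch: since set and shift inside pop only affect pop's own arguments, a caller that wants its own argument list shortened would presumably have to inline the rotation loop. The function name drop_last below is hypothetical, and the counter is incremented in a separate step purely for readability.

```shell
# Hypothetical caller with the rotation loop inlined, so that the
# function's own "$@" ends up without its last argument.
drop_last() {
    [ $# -gt 0 ] || return 0   # nothing to remove
    i=1
    while [ "$i" -lt $# ]; do
        set -- "$@" "$1"       # append a copy of the first argument
        shift                  # drop the original: rotates left by one
        i=$((i+1))
    done                       # a b c -> c a b
    shift                      # drop the former last argument
    printf '%s\n' "$*"
}

drop_last a b c      # prints: a b
drop_last "x y" z    # prints: x y
```

The loop runs once per remaining argument, so the cost grows with the argument count, but no eval or external process is involved.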

Upvotes: 2

phicr

Reputation: 1402

This is my current answer:

pop() {
  local n=$(($1 - ${2:-1}))
  if [ -n "$ZSH_VERSION" -o -n "$BASH_VERSION" ]; then
    POP_EXPR='set -- "${@:1:'$n'}"'
  elif [ $n -ge 500 ]; then
    # [0-9][0-9]* and & are the portable spellings of GNU sed's [0-9]\+ and \0
    POP_EXPR="set -- $(seq -s " " 1 $n | sed 's/[0-9][0-9]*/"${&}"/g')"
  else
    local index=0
    local arguments=""
    while [ $index -lt $n ]; do
      index=$((index+1))
      arguments="$arguments \"\${$index}\""
    done
    POP_EXPR="set -- $arguments"
  fi
}

Note that local is not POSIX, but since all major sh shells support it (specifically the ones I asked for in my question) and not having it can cause serious bugs, I decided to include it in this leading function. Here's a version without local, using prefixed variable names to reduce the chance of name clashes (note that seq, used in the large-count branch, is itself not specified by POSIX):

pop() {
  __pop_n=$(($1 - ${2:-1}))
  if [ -n "$ZSH_VERSION" -o -n "$BASH_VERSION" ]; then
    POP_EXPR='set -- "${@:1:'$__pop_n'}"'
  elif [ $__pop_n -ge 500 ]; then
    # [0-9][0-9]* and & are the portable spellings of GNU sed's [0-9]\+ and \0
    POP_EXPR="set -- $(seq -s " " 1 $__pop_n | sed 's/[0-9][0-9]*/"${&}"/g')"
  else
    __pop_index=0
    __pop_arguments=""
    while [ $__pop_index -lt $__pop_n ]; do
      __pop_index=$((__pop_index+1))
      __pop_arguments="$__pop_arguments \"\${$__pop_index}\""
    done
    POP_EXPR="set -- $__pop_arguments"
  fi
}

Usage

pop1() {
  pop $#
  eval "$POP_EXPR"
  echo "$@"
}

pop2() {
  pop $# 2
  eval "$POP_EXPR"
  echo "$@"
}

pop1 a b c #=> a b
pop1 $(seq 1 1000) #=> 1 .. 999
pop2 $(seq 1 1000) #=> 1 .. 998

pop_next

Once you've created the POP_EXPR variable with pop, you can use the following function to change it to omit further arguments:

pop_next() {
  if [ -n "$BASH_VERSION" -o -n "$ZSH_VERSION" ]; then
    local np="${POP_EXPR##*:}"
    np="${np%\}*}"
    POP_EXPR="${POP_EXPR%:*}:$((np == 0 ? 0 : np - 1))}\""
    return
  fi
  POP_EXPR="${POP_EXPR% \"*}"
}

pop_next is a much simpler operation than pop in POSIX shells (though it's slightly more complex than pop on zsh and bash).

It's used like this:

main() {
  pop $#
  pop_next
  eval "$POP_EXPR"
  echo "$@"
}

main 1 2 3 #=> 1
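
To illustrate why the POSIX branch of pop_next is cheap, here is a small sketch of the suffix trimming it relies on, applied to a hand-written POP_EXPR. The names trim_last_ref and demo are hypothetical; the expansion is the same one used in pop_next.

```shell
# Hypothetical wrapper around the expansion used by the POSIX branch:
# remove the shortest suffix that starts with a space and a double quote,
# i.e. the last quoted parameter reference.
trim_last_ref() {
  POP_EXPR="${POP_EXPR% \"*}"
}

demo() {
  POP_EXPR='set -- "${1}" "${2}" "${3}"'   # as pop would build it for 4 arguments
  trim_last_ref
  printf '%s\n' "$POP_EXPR"
  eval "$POP_EXPR"
  printf '%s\n' "$*"
}

demo a b c d
# prints:
#   set -- "${1}" "${2}"
#   a b
```

No loop or counter is needed: a single parameter expansion drops one more reference from the end of the string.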

POP_EXPR and variable scope

Note that if you don't run eval "$POP_EXPR" immediately after pop and pop_next, a function call in between could overwrite the POP_EXPR variable and break things. To avoid this, declare local POP_EXPR at the start of every function that uses pop, if local is available.

f() {
  local POP_EXPR
  pop $#
  g 1 2
  eval "$POP_EXPR"
  printf '%s' "f=$*"
}

g() {
  local POP_EXPR
  pop $#
  eval "$POP_EXPR"
  printf '%s, ' "g=$*"
}

f a b c #=> g=1, f=a b
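
Where local is not available, one possible workaround is to copy POP_EXPR into a function-specific variable right after calling pop. The following is only a sketch under that assumption: pop here is a cut-down stand-in containing just the loop branch, and outer_expr, inner and outer are hypothetical names.

```shell
# Cut-down stand-in for pop (loop branch only), for illustration.
pop() {
  __pop_n=$(($1 - ${2:-1}))
  __pop_index=0
  POP_EXPR="set --"
  while [ "$__pop_index" -lt "$__pop_n" ]; do
    __pop_index=$((__pop_index+1))
    POP_EXPR="$POP_EXPR \"\${$__pop_index}\""
  done
}

inner() {
  pop $#
  eval "$POP_EXPR"
  printf '%s, ' "inner=$*"
}

outer() {
  pop $#
  outer_expr=$POP_EXPR   # per-function copy, taken before any other call
  inner 1 2              # overwrites the global POP_EXPR
  eval "$outer_expr"     # still refers to outer's own argument count
  printf '%s\n' "outer=$*"
}

outer a b c   # prints: inner=1, outer=a b
```

Evaluating "$POP_EXPR" instead of "$outer_expr" in outer would keep only one argument, because inner has replaced the expression in the meantime.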

popgen.sh

This particular function is good enough for my purposes, but I did create a script to generate further optimized functions.

https://gist.github.com/fcard/e26c5a1f7c8b0674c17c7554fb0cd35c#file-popgen-sh

One way to improve performance without external tools is to recognize that many small string concatenations are slow, so doing them in batches makes the function considerably faster. Calling the script as popgen.sh -gN1,N2,N3 creates a pop function that handles the operations in batches of N1, N2, or N3 depending on the argument count. The script also contains other tricks, exemplified and explained below:
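
The batching idea can be sketched as follows. This is not the output of popgen.sh, only a minimal illustration with a fixed batch size of 4; pop_batched, demo and the __pb_ variables are hypothetical names.

```shell
# Sketch: build POP_EXPR four references at a time while at least four
# remain, then finish the remainder one reference at a time.
pop_batched() {
  __pb_n=$(($1 - 1))
  __pb_i=0
  POP_EXPR="set --"
  while [ $((__pb_i + 4)) -le "$__pb_n" ]; do
    # one concatenation adds four quoted parameter references
    POP_EXPR="$POP_EXPR \"\${$((__pb_i+1))}\" \"\${$((__pb_i+2))}\" \"\${$((__pb_i+3))}\" \"\${$((__pb_i+4))}\""
    __pb_i=$((__pb_i + 4))
  done
  while [ "$__pb_i" -lt "$__pb_n" ]; do
    __pb_i=$((__pb_i + 1))
    POP_EXPR="$POP_EXPR \"\${$__pb_i}\""
  done
}

demo() {
  pop_batched $#
  eval "$POP_EXPR"
  printf '%s\n' "$*"
}

demo 1 2 3 4 5 6 7 8 9 10 11   # prints: 1 2 3 4 5 6 7 8 9 10
```

With 11 arguments the first loop runs twice (references 1 to 8) and the second loop adds references 9 and 10, so the string is extended four times instead of ten.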

$ sh popgen  \
>  -g 10,100 \ # concatenate strings in batches\
>  -w        \ # overwrite current file\
>  -x9       \ # hardcode the result of the first 9 argument counts\
>  -t1000    \ # starting at argument count 1000, use external tools\
>  -p posix  \ # prefix to add to the function name (with an underscore)\
>  -s ''     \ # suffix to add to the function name (with an underscore)\
>  -c        \ # use the command popsh instead of seq/sed as the external tool\
>  -@        \ # on zsh and bash, use the subarray method (checks on runtime)\
>  -+        \ # use bash/zsh extensions (removes runtime check from -@)\
>  -nl       \ # don't use 'local'\
>  -f        \ # use 'function' syntax\
>  -o pop.sh   # output file

An equivalent to the above function can be generated with popgen.sh -t500 -g1 -@. In the gist containing popgen.sh you will find a popsh.c file that can be compiled and used as a specialized, faster alternative to the default external tools. It will be used by any function generated with popgen.sh -c ... if it's accessible as popsh by the shell. Alternatively, you can create any function or tool named popsh and use it in its place.

Benchmark

Benchmark functions:

The script I used for benchmarking can be found on this gist: https://gist.github.com/fcard/f4aec7e567da2a8e97962d5d3f025ad4#file-popbench-sh

The benchmark functions are found in these lines: https://gist.github.com/fcard/f4aec7e567da2a8e97962d5d3f025ad4#file-popbench-sh-L233-L301

The script can be used as such:

$ sh popbench.sh   \
>   -s dash        \ # shell used by the benchmark, can be dash/bash/ash/zsh/ksh.\
>   -f posix       \ # function to be tested\
>   -i 10000       \ # number of times that the function will be called per test\
>   -a '\0'        \ # replacement pattern to model arguments by index (uses sed)\
>   -o /dev/stdout \ # where to print the results to (concatenates, defaults to stdout)\
>   -n 5,10,1000     # argument sizes to test

It will output a time -p style sheet with real, user and sys time values, as well as an int value (for internal) that is calculated inside the benchmark process using date.

Times

The following are the int results of calls to

$ sh popbench.sh -s $shell -f $function -i 10000 -n 1,5,10,100,1000,10000

posix refers to the second and third branches of the pop function above, subarray refers to the first, and final refers to the whole function.

value count           1           5          10         100        1000        10000
---------------------------------------------------------------------------------------
dash/final        0m0.109s    0m0.183s    0m0.275s    0m2.270s   0m16.122s   1m10.239s
ash/final         0m0.104s    0m0.175s    0m0.273s    0m2.337s   0m15.428s   1m11.673s
ksh/final         0m0.409s    0m0.557s    0m0.737s    0m3.558s   0m19.200s   1m40.264s
bash/final        0m0.343s    0m0.414s    0m0.470s    0m1.719s   0m17.508s   3m12.496s
---------------------------------------------------------------------------------------
bash/subarray     0m0.135s    0m0.179s    0m0.224s    0m1.357s   0m18.911s   3m18.007s
dash/posix        0m0.171s    0m0.290s    0m0.447s    0m3.610s   0m17.376s    1m8.852s
ash/posix         0m0.109s    0m0.192s    0m0.285s    0m2.457s   0m14.942s   1m10.062s
ksh/posix         0m0.416s    0m0.581s    0m0.768s    0m4.677s   0m18.790s   1m40.407s
bash/posix        0m0.409s    0m0.739s    0m1.145s   0m10.048s   0m58.449s  40m33.024s

On zsh

For large argument counts, running set -- ... through eval is very slow on zsh no matter the method, save for eval 'set -- "${@:1:$# - 1}"'. Even a modification as simple as changing it to eval "set -- ${@:1:$# - 1}" (ignoring that it doesn't work for arguments with spaces) makes it two orders of magnitude slower.

value count           1           5          10         100        1000        10000
---------------------------------------------------------------------------------------
zsh/subarray      0m0.203s    0m0.227s    0m0.233s    0m0.461s    0m3.643s   0m38.396s
zsh/final         0m0.399s    0m0.416s    0m0.441s    0m0.722s    0m4.205s   0m37.217s
zsh/posix         0m0.718s    0m0.913s    0m1.182s    0m6.200s   0m46.516s  42m27.224s
zsh/eval-zsh      0m0.419s    0m0.353s    0m0.375s    0m0.853s    0m5.771s  32m59.576s

More benchmarks

For more benchmarks, including ones that use only external tools, the C popsh tool, or the naive algorithm, see this file:

https://gist.github.com/fcard/f4aec7e567da2a8e97962d5d3f025ad4#file-benchmarks-md

It's generated like this:

$ git clone https://gist.github.com/f4aec7e567da2a8e97962d5d3f025ad4.git popbench
$ cd popbench
$ sh popgen_run.sh
$ sh popbench_run.sh --fast # or without --fast if you have a day to spare
$ sh poptable.sh -g >benchmarks.md

Conclusion

This is the result of a week of research on the subject, and I thought I'd share it. Hopefully it's not too long; I tried to trim it to the main information, with links to the gists. This was initially written as an answer to (Remove last argument from argument list of shell script (bash)), but I felt the focus on POSIX made it off topic there.

All the code in the gists linked here is licensed under the MIT license.

Upvotes: 4