terr

Reputation: 51

indirect references in bash and ksh

I have a problem with indirection in both bash and ksh. The example below is for ksh. It uses a nameref (typeset -n), but it doesn't work as I expected. func_a passes the name of an array to func_b so that func_b can modify it (in this simple case, func_b adds a second entry to the array). This apparently fails because the second local variable defined in func_b happens to have the same name as the variable that the nameref var2 refers to (the array var1 from func_a).

But isn't one of the reasons for having a native nameref type (as opposed to the various eval hacks one would use in bash) to avoid these dynamic scoping problems, where a function (func_b in this case) works as intended for some callers and not for others just because of the names of locally defined variables? It seems that a nameref is basically just an alias or a macro rather than a safe way to connect two strictly separated scopes. I had exactly the same problem with bash, and I hoped that ksh would implement indirect references as C does (not as low-level pointers, of course, for security reasons, but at least with a similar "scope isolation", so to speak). Am I missing something?

func_b ()
{
  typeset -n var2=$1
  typeset -i var1

  var2[1]=b
}


func_a ()
{
  typeset -a var1=( a )
  func_b var1
  echo "${var1[@]}"
}

func_a

EDIT by Ed Morton: It took me a while to figure out what this question was about, so here it is: the OP is concerned about the difference in output between these 2 scripts, depending on whether or not a local var1 is declared in func_b():

Script 1:

$ cat tst.sh
#!/usr/bin/env bash

func_b ()
{
    local -n var2=$1
    local var1

    var2=99999
}


func_a ()
{
    local var1=3
    func_b var1
    echo "NOTE still original contents: $var1"
}

func_a

$ ./tst.sh
NOTE still original contents: 3

Script 2:

$ cat tst.sh
#!/usr/bin/env bash

func_b ()
{
    local -n var2=$1
    #local var1      # << Note: now commented out

    var2=99999
}


func_a ()
{
    local var1=3
    func_b var1
    echo "NOTE now modified contents: $var1"
}

func_a

$ ./tst.sh
NOTE now modified contents: 99999

Upvotes: 1

Views: 983

Answers (3)

weibeld

Reputation: 15302

The core of the issue is that nameref expansion is essentially macro expansion at the code level, with no deeper binding mechanism behind it.

From the Bash manual:

Whenever the nameref variable is referenced, assigned to, unset, or has its attributes modified (other than using or changing the nameref attribute itself), the operation is actually performed on the variable specified by the nameref variable’s value.

That is, if ref is a nameref variable and its value is x, then $ref is essentially equivalent to $x and evaluates to whatever $x evaluates to at that point in the script. And these may be different things at different points in time.

It can be best seen in a code example like this:

x=foo
myfunc() {
  local -n ref=x
  echo "$ref = $x"  # Outputs "foo = foo"
  local x="bar"
  echo "$ref = $x"  # Outputs "bar = bar"
}
myfunc

The two echo statements can actually be thought of as reading:

echo "$x = $x"

And in the first one, x refers to the global variable x, and in the second one, x refers to the local variable x that just happens to be declared between the two echo statements.

The crux is that a statement like:

declare -n ref=x

does not create a fixed binding between ref and the variable that x names at the moment the statement executes. It just assigns the string x to the variable ref. What that name refers to is then determined each time ref is used, for example each time $ref is evaluated.
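A minimal sketch of this late resolution, assuming bash 4.3 or later (where namerefs are available). The function name show_ref is hypothetical and only there for illustration. ${!ref} shows that the nameref stores nothing but the name x, and the same $ref yields different values depending on which x is visible at that point:

```bash
#!/usr/bin/env bash
x=foo
declare -n ref=x      # stores the name "x", not a binding to this particular x

echo "${!ref}"        # prints the stored name: x

# Hypothetical helper: declares its own x before reading the nameref.
show_ref() {
    local x=shadow
    echo "$ref"       # x is looked up now, so the local x wins: shadow
}

show_ref              # prints: shadow
echo "$ref"           # prints: foo (the local x is gone again)
```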

As a consequence, if you pass variable names to a function that uses a nameref, you have to treat the named variable like a global variable passed to that function, and you have to make sure that no local variable shadows it. The complication is that you don't know what this variable is called, which makes shadowing harder to prevent. One possible approach is a naming convention: prefix local variables with __ in functions that use namerefs, and make sure that variable names passed to such functions never start with __.
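The naming convention could look roughly like this. It is only a sketch, assuming bash 4.3 or later; append_item and caller are made-up names, and the convention holds only as long as every caller sticks to it:

```bash
#!/usr/bin/env bash

# Hypothetical helper: appends a value to the array whose name is passed as $1.
# All of its locals start with __ so they cannot collide with caller names.
append_item() {
    local -n __arr=$1
    local __value=$2
    __arr+=("$__value")
}

# Hypothetical caller: its variable names never start with __.
caller() {
    local -a arr=(a)
    local value=unrelated   # would be shadowed if append_item used "value" itself
    append_item arr b
    echo "${arr[@]}"
}

caller   # prints: a b
```

The convention does not isolate the scopes. It only makes an accidental name clash unlikely.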

Upvotes: 0

terr

Reputation: 51

I found a solution. Well, the author of http://fvue.nl/wiki/Bash:_Passing_variables_by_reference did, so full credit to him. I don't like the way he uses his findings about bash's unset, but anyone can use this property in their own way. Here's the gist of it:

In bash, the code above would look like this:

func_b ()
{
  local var2=$1
  local -i var1
  # do some work to compute the value b
  # ....
  # ....
  # And in the end assign it with the indirect reference
  eval "$var2[1]=b"
}


func_a ()
{
 local -a var1=( a )
 func_b var1
 echo "${var1[@]}"
}

(The eval could be avoided, but let's stay on point.) The problem is obviously that the local var1 in func_b shadows the var1 in func_a that var2 refers to. So func_b behaves as intended, i.e. it adds a second entry to the caller's array by indirect reference, only when the caller doesn't name its array "var1". Let's say that after the "do some work" part I know that I'm done with the local variable var1 in func_b (it was probably used to compute the wanted value b). At this point I could try this:

func_b ()
{
  local var2=$1
  local -i var1
  # do some work to compute the value b
  # ....
  # ....
  # And in the end assign it with the indirect reference
  unset var1
  eval "$var2[1]=b"
}

to remove the "shadow" on func_a's var1 and finish the computation correctly. BUT bash's unset does not allow that. Once I've declared the local var1 in func_b, it will still shadow func_a's var1 even if I unset it at some point. What the author of the page above found out is that unset can in fact reach through the call stack and remove func_b's var1, but only when it is invoked from a function f that func_b calls (so f's frame sits above func_b's on the stack), and only IF f doesn't define its own local var1. Basically, if you do this

func_the_unshadower ()
{
  unset -v var1
}

func_b ()
{
  local var2=$1
  local -i var1
  # do some work to compute the value b
  # ....
  # ....
  # And in the end assign it with the indirect reference
  func_the_unshadower
  eval "$var2[1]=b"
}

func_a ()
{
 local -a var1=( a )
 func_b var1
 echo "${var1[@]}"
}

it works. Obviously this is just a toy example, and everyone can figure out their preferred way to use this property of unset. A simple one is to check at runtime whether the variable referred to by name is shadowed by some local variable, by invoking "local" without parameters (which prints the list of local variables). The great thing is that this is not a bug in bash. The page linked above even links to a thread on the bash mailing list where the main bash developer says that this is how unset is intended to behave and that it will stay that way.
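To isolate the property itself, here is a small sketch that compares unset in the same scope with unset from a called function. It assumes bash with the localvar_unset shell option left at its default (off). The function names are made up for illustration:

```bash
#!/usr/bin/env bash
x=outer

# Hypothetical helper: runs unset one call level below its caller.
unset_from_callee() {
    unset -v "$@"
}

same_scope() {
    local x=inner
    unset -v x               # x stays local here, it just has no value any more
    echo "${x-<unset>}"      # so the outer x is still shadowed
}

via_callee() {
    local x=inner
    unset_from_callee x      # removes via_callee's local x altogether
    echo "${x-<unset>}"      # the previously shadowed x is visible again
}

same_scope    # prints: <unset>
via_callee    # prints: outer
```

Note that the second form also throws away the local variable, so it only fits once the function no longer needs it.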


EDIT by Ed Morton: So, given the above, to fix the example I added at the bottom of the question we could do:

$ cat tst.sh
#!/usr/bin/env bash

unset_vars() {
    unset -v "$@"
}

func_b ()
{
    local -n var2=$1
    local var1

    unset_vars var1
    var2=99999
}

func_a ()
{
    local var1=3
    func_b var1
    echo "NOTE now modified contents: $var1"
}

func_a

$ ./tst.sh
NOTE now modified contents: 99999

Upvotes: 1

Pete

Reputation: 143

I found that it works if the functions are declared with the syntax:

function func_a

This is because this ksh93 syntax makes variables declared with typeset local to the function, whereas with the original name() syntax POSIX rules apply and the variables are global.

Pete

Upvotes: 1
