edoardo pedrini

Reputation: 121

python vs. R scope

I am trying to understand why two functions in R produce different output from what I believe is the same(?) implementation in Python.

python:

    def increment(n):
        n = n + 1
        print(n)

    n = 1
    increment(n)
    print(n)
2
1

    def increment2(x):
        x[0] = x[0] + 1
        print(x)

    n = [1]
    increment2(n)
    print(n)

[2]
[2]

R:

increment <- function(n){
  n = n + 1
  print(n) 
}

n = 1
increment(n)
2
print(n)
1

increment2 <- function(n){
  n[1] = n[1] + 1
  print(n)
}

n = c(1)
increment2(n)
2
print(n)
1

To me, the R output seems more consistent: everything stays inside the function and nothing leaks out (unless I return the result and assign it back to n). Can anyone give me a Pythonic interpretation of this?

Upvotes: 3

Views: 761

Answers (3)

lebatsnok

Reputation: 6459

R is heavily influenced by functional languages, most notably Scheme. In functional languages, a "function" is understood just as in mathematics: it does not (and cannot) change its arguments, and its output depends only on its arguments (and nothing else).

# pseudocode
let x be 1
tell_me sin(x)   # 0.841
tell_me x   # still 1

It is inconceivable that sin(x) would commit a sin (from a functional perspective) and assign a new value to x.

R is not a purely functional language, however.

(1) You can (easily, and sometimes with bad consequences) access objects defined outside a function from within it.

> rm(jumbo) # if you're not running this for the first time
> mumbo <- function() jumbo
> mumbo()
Error in mumbo() : object 'jumbo' not found
> jumbo <- 1
> mumbo()
[1] 1
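
For comparison, here is a rough Python counterpart of the transcript above, offered as a sketch rather than an exact equivalent. The names mumbo and jumbo are carried over from the R example. Python also looks up a free name only when the function is called, so the first call fails and the second succeeds:

```python
def mumbo():
    # jumbo is not defined inside the function, so Python searches
    # the enclosing scopes (here: the module) at call time.
    return jumbo

try:
    mumbo()
    first_call = "found"
except NameError:
    first_call = "not found"

print(first_call)   # not found

jumbo = 1
second_call = mumbo()
print(second_call)  # 1
```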

[edit] A comment objected that some objects need to be visible from within a function. That is completely true: for example, one cannot possibly define the arithmetic operations in every function, so the definition of + must be accessible. The difference is that in some languages you have explicit control over what is accessible and what is not. I'm not a Python expert, but I guess that's what is meant by

 from jumbo import *

R has packages, which you can attach in a similar way, but the difference is that everything in your workspace is, by default, visible from within a function. This may be useful but is also dangerous, as you may inadvertently refer to objects that you forgot to define within a function ... and the function will silently give the wrong result, as in the following example:

X <- 1e+10
addone <- function(x) X + 1  # meant to increment the argument x by 1, but X is the global
addone(3)
# [1] 1e+10  
addone(3)==1e+10+1
# [1] TRUE   

Packages reduce this risk: a function in a package looks up names in the package namespace and its imports before it ever reaches your global workspace, so it is much less likely to pick up values from there by accident. And if you are so inclined, you can change the environment of your own functions as well. This might be a way to prevent such accidental errors (not necessarily a convenient way, though):

environment(mumbo)  # .GlobalEnv
environment(mumbo) <- baseenv()  # changing the environment
mumbo()  # error: object 'jumbo' not found

[/edit]

(2) You can, if you want to, change outside objects from within a function, for example, with <<- (as opposed to <-):

> increment.n <- function(){
+   n <<- n + 1
+   print(n) 
+ }
> increment.n()
Error in increment.n() : object 'n' not found
> n <- 1
> increment.n()
[1] 2
> n
[1] 2
> 
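
For readers coming from the Python side, the closest counterpart I can think of is the global statement. This is only a sketch: the function name increment_n is made up to mirror increment.n above, and the match is not exact, since <<- walks up the enclosing environments while global always targets the module level (nonlocal is the statement for an enclosing function).

```python
n = 1

def increment_n():
    # Without this declaration, `n = n + 1` would raise UnboundLocalError,
    # because assignment makes n local to the function by default.
    global n
    n = n + 1
    print(n)

increment_n()  # prints 2
print(n)       # prints 2: the module-level n was changed
```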

Upvotes: 2

G. Grothendieck

Reputation: 269664

This can be interpreted in terms of object identity.

A list x in Python is like a pointer in that it has an identity independent of its contents, so assigning a new value to an element of the list does not change the list's identity. Changing the contents inside a function therefore modifies the very list the caller holds, and a function is free to do so.
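
A small sketch of the identity argument on the Python side, using the built-in id(). The function names rebind and mutate are invented for this illustration and mirror increment and increment2 from the question:

```python
def rebind(n):
    # Assignment binds the local name n to a new object;
    # the caller's object is not touched.
    n = n + 1
    return n

def mutate(x):
    # Item assignment changes the contents of the same list
    # object that the caller holds.
    x[0] = x[0] + 1
    return id(x)

number = 1
items = [1]
items_id_before = id(items)

rebind(number)
items_id_inside = mutate(items)

print(number)                              # 1
print(items)                               # [2]
print(items_id_inside == items_id_before)  # True: same list, new contents
```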

A vector in R does not have an identity apart from its contents. Changing the contents in the function creates a new vector. The original vector is unchanged. R does have objects which have object identity -- they are called environments.

increment3 <- function(e){
  e$n = e$n + 1
  print(e$n)
}

e <- new.env()
e$n <- 1
increment3(e)
## [1] 2
print(e$n)
## [1] 2

In R, it is also possible to modify a vector in place using external C or C++ code. For example, see https://gist.github.com/ggrothendieck/53811e2769d0582407ae

Upvotes: 4

Matt Messersmith

Reputation: 13747

I can't speak for how R passes parameters, but it's pretty common for programming languages (including Python) to have mutations on mutable objects be reflected outside of the function that performed the mutation. Java, C#, and other popular languages that support OOP (Object Oriented Programming) act this way too.

Lists like [1] are mutable objects, so you see that mutation outside of the function. This type of behavior makes object oriented programming much more convenient.

If this behavior is undesirable, consider using a functional programming style in Python (immutable objects, map, filter, reduce) or passing copies of your mutable objects to your functions.
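
A minimal sketch of those options, with function names made up for the example (increment_in_place and incremented are not from the question):

```python
def increment_in_place(values):
    # Mutates the list object it receives.
    values[0] = values[0] + 1

def incremented(values):
    # Functional style: build and return a new list, leave the argument alone.
    return [v + 1 for v in values]

n = [1]

increment_in_place(list(n))  # list(n) is a shallow copy, so n is not touched
print(n)                     # [1]

m = incremented(n)           # the result goes into a new name
print(n, m)                  # [1] [2]

t = (1,)                     # tuples are immutable
try:
    t[0] = t[0] + 1
except TypeError as err:
    print("cannot mutate a tuple:", err)
```

Note that list(n) only copies the outer list; for nested structures, copy.deepcopy from the standard library would be the safer choice.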

I don't think there's much going on here that has to do with it being pythonic or not. It's a language mechanism: nothing more.

Upvotes: 3
