Reputation: 121
I am trying to understand why two functions give different output in R than what looks like the same implementation in Python.
python:
def increment(n):
    n = n + 1
    print(n)
n = 1
increment(n)
print(n)
2
1
def increment2(x):
    x[0] = x[0] + 1
    print(x)
n = [1]
increment2(n)
print(n)
[2]
[2]
R:
increment <- function(n){
  n = n + 1
  print(n)
}
n = 1
increment(n)
2
print(n)
1
increment2 <- function(n){
  n[1] = n[1] + 1
  print(n)
}
n = c(1)
increment2(n)
2
print(n)
1
In my head, the R output seems more consistent: everything stays inside the function and does not leak outside (unless I return the result and assign it back to n). Can anyone give me a Pythonic interpretation of this?
Upvotes: 3
Views: 761
Reputation: 6459
R is heavily influenced by functional languages, most notably Scheme. In functional languages, a "function" is understood just as in mathematics: it does not (and cannot) change its arguments, and its output depends only on its arguments (and nothing else).
# pseudocode
let x be 1
tell_me sin(x) # 0.841
tell_me x # still 1
It is conceivable that sin(x) would commit a sin (from a functional perspective) and assign a new value to x.
R is not a purely functional language, however.
(1) You can (easily, and sometimes with bad consequences) access objects defined outside a function from within it.
> rm(jumbo) # if you're not running this for the first time
> mumbo <- function() jumbo
> mumbo()
Error in mumbo() : object 'jumbo' not found
> jumbo <- 1
> mumbo()
[1] 1
[edit] There was an objection in a comment that some objects need to be visible from within a function. That is completely true; for example, one cannot possibly define the arithmetic operations in every function. So the definition of + must be accessible ... but the difference is that in some languages you have explicit control over what is accessible and what is not. I'm not a Python expert, but I guess that's what is meant by
from jumbo import *
R has packages, which you can attach in a similar way, but the difference is that everything in your workspace is, by default, visible from within a function. This may be useful, but it is also dangerous: you may inadvertently refer to an object that you forgot to define within the function, and the function will silently do the wrong thing, as in the following example:
X <- 1e+10
addone <- function(x) X + 1 # want to increment the argument by 1
addone(3)
# [1] 1e+10
addone(3)==1e+10+1
# [1] TRUE
This is much less likely in packages: a function in a package resolves names in the package namespace and its imports before it ever looks at your global workspace. And if you are so inclined, you can change the environment of your own functions as well. This might be a way to prevent such accidental errors (not necessarily a convenient way, though):
environment(mumbo) # .GlobalEnv
environment(mumbo) <- baseenv() # changing the environment
mumbo() # error: object 'jumbo' not found
[/edit]
(2) You can, if you want to, change outside objects from within a function, for example with <<- (as opposed to <-):
> increment.n <- function(){
+ n <<- n + 1
+ print(n)
+ }
> increment.n()
Error in increment.n() : object 'n' not found
> n <- 1
> increment.n()
[1] 2
> n
[1] 2
>
Upvotes: 2
Reputation: 269664
This can be interpreted in terms of object identity.
A list x in Python is like a pointer in that it has an identity independent of its contents, so assigning a new value to an element of the list does not change the identity of the list. A function that receives the list is free to change its contents, and because the caller holds the same object, the caller sees the change.
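One way to check this is with the built-in id(), which returns an object's identity. The following is a small sketch; the function names increment_in_place and increment_rebind are invented for the illustration and are not from the question.

```python
def increment_in_place(x):
    # Mutates the list object that the caller also refers to.
    x[0] = x[0] + 1
    return id(x)

def increment_rebind(x):
    # Builds a new list and rebinds only the local name x.
    x = [x[0] + 1]
    return id(x)

n = [1]
id_before = id(n)
id_inside = increment_in_place(n)
same_object = (id_before == id_inside == id(n))
print(n, same_object)        # [2] True

m = [1]
id_new = increment_rebind(m)
print(m, id_new == id(m))    # [1] False
```

The second function behaves like the R version: the caller's list is untouched because only a local name was pointed at a new object.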
A vector in R does not have an identity apart from its contents. Changing the contents in the function creates a new vector. The original vector is unchanged. R does have objects which have object identity -- they are called environments.
increment3 <- function(e){
e$n = e$n + 1
print(e$n)
}
e <- new.env()
e$n <- 1
increment3(e)
## [1] 2
print(e$n)
## [1] 2
In R, it is also possible to modify a vector in place using external C or C++ code. For example, see https://gist.github.com/ggrothendieck/53811e2769d0582407ae
Upvotes: 4
Reputation: 13747
I can't speak for how R passes parameters, but it's pretty common for programming languages (including Python) to have mutations on mutable objects be reflected outside of the function that performed the mutation. Java, C#, and other popular languages that support OOP (Object Oriented Programming) act this way too.
Lists like [1] are mutable objects, so you see that mutation outside of the function. This type of behavior makes object-oriented programming much more convenient.
If this behavior is undesirable, consider using a functional programming style in Python (immutable objects, map, filter, reduce) or passing copies of your mutable objects to your functions.
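As a rough sketch of both options, assuming the increment2 function from the question (changed here to return its argument instead of printing it), plus a hypothetical non-mutating variant named incremented:

```python
import copy

def increment2(x):
    # Same mutation as in the question, but returns the list instead of printing.
    x[0] = x[0] + 1
    return x

def incremented(x):
    # Functional style: build and return a new list, leave the argument alone.
    return list(map(lambda value: value + 1, x))

n = [1]

# Option 1: pass a copy, so the mutation happens on the copy only.
result_copy = increment2(copy.copy(n))
print(result_copy, n)   # [2] [1]

# Option 2: use a function that never mutates its argument.
result_pure = incremented(n)
print(result_pure, n)   # [2] [1]

# Option 3: use an immutable object; in-place assignment is rejected.
frozen = (1,)
try:
    frozen[0] = 2
    tuple_rejected_assignment = False
except TypeError:
    tuple_rejected_assignment = True
print(tuple_rejected_assignment)  # True
```

Note that copy.copy makes a shallow copy; for nested lists, copy.deepcopy would be needed to protect the inner objects as well.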
I don't think there's much going on here that has to do with it being pythonic or not. It's a language mechanism: nothing more.
Upvotes: 3