Reputation: 1372
Take the following example:
boltzmann <- function(x, t=0.1) { exp(x/t) / sum(exp(x/t)) }
z=rnorm(10,mean=1,sd=0.5)
t <- 0.1  # same temperature as the function's default; otherwise t is base R's transpose function
exp(z[1]/t)/sum(exp(z/t))
[1] 0.0006599707
boltzmann(z)[1]
[1] 0.0006599707
It appears that exp in the boltzmann function operates over elements and vectors and knows when to do the right thing. Is the sum "unrolling" the input vector and applying the expression on the values? Can someone explain how this works in R?
Edit: Thank you for all of the comments, clarification, and patience with an R n00b. In summary, the reason this works was not immediately obvious to me coming from other languages. Take Python, for example: you would first compute the sum and then compute the value for each element in the vector.
from math import exp  # x is a list of numbers, t is the temperature
denom = sum([exp(v / t) for v in x])
vals = [exp(v / t) / denom for v in x]
Whereas in R, sum(exp(x/t)) can be computed inline.
Upvotes: 1
Views: 218
Reputation: 78640
This might be clearer if you evaluated the numerator and the denominator separately:
x = rnorm(10,mean=1,sd=0.5)
t = .1
exp(x/t)
# [1] 1.845179e+05 6.679273e+03 4.379369e+06 1.852623e+06 9.960374e+02
# [6] 1.359676e+09 6.154045e+03 1.777027e+01 1.070003e+04 6.217397e+04
sum(exp(x/t))
# [1] 2984044296
Since the numerator is a vector of length 10, and the denominator is a vector of length 1, the division returns a vector of length 10.
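A minimal sketch of that recycling, using a small fixed vector instead of random draws so the result is reproducible. The values and the names temp, num, den and p are illustrative assumptions, not part of the question:

```r
x <- c(1, 2, 3)        # fixed input instead of rnorm(), for reproducibility
temp <- 1              # illustrative temperature (the question uses t = 0.1)

num <- exp(x / temp)   # element-wise: one value per element of x
den <- sum(num)        # collapses the vector to a single number

length(num)            # 3
length(den)            # 1

p <- num / den         # den is recycled across every element of num
length(p)              # 3
sum(p)                 # 1, up to floating-point rounding
```

The division never needs an explicit loop: the length-1 denominator is simply reused for each element of the numerator.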
Since you're interested in comparing this to Python, imagine the following two rules were added to Python (incidentally, these are similar to how arrays behave in NumPy):
If you divide a list by a number, it will divide all items in the list by the number:
[2, 4, 6, 8] / 2
# [1, 2, 3, 4]
The function exp in Python is "vectorized", which means that when it is applied to a list it will apply to each item in the list. However, sum still works the way you expect it to.
exp([1, 2, 3]) => [exp(1), exp(2), exp(3)]
In that case, imagine how this code would be evaluated in Python:
t = .1
x = [1, 2, 3, 4]
exp(x/t) / sum(exp(x/t))
It would go through the following simplifications, using those two simple rules:
exp([v / t for v in x]) / sum(exp([v / t for v in x]))
[exp(v / t) for v in x] / sum([exp(v / t) for v in x])
Now do you see how it knows the difference?
Upvotes: 3
Reputation: 121177
Vectorisation has several slightly different meanings in R.
It can mean accepting a vector input, transforming each element, and returning a vector (like exp does).
It can also mean accepting a vector input and calculating some summary statistic, then returning a scalar value (like mean does).
sum conforms to the second behaviour, but also has a third vectorisation behaviour, where it will create a summary statistic across inputs. Try sum(1, 2:3, 4:6), for example.
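The three behaviours can be compared side by side in a short sketch. The variable names here are illustrative only:

```r
v <- 1:3

elementwise <- exp(v)             # first meaning: one output per input element
length(elementwise)               # 3

summary_one <- mean(v)            # second meaning: one vector in, one value out
summary_one                       # 2

across_inputs <- sum(1, 2:3, 4:6) # third meaning: all arguments pooled, then summed
across_inputs                     # 1 + 2 + 3 + 4 + 5 + 6 = 21
```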
Upvotes: 1
Reputation: 176728
This is explained in An Introduction to R, Section 2.2: Vector arithmetic.
Vectors can be used in arithmetic expressions, in which case the operations are performed element by element. Vectors occurring in the same expression need not all be of the same length. If they are not, the value of the expression is a vector with the same length as the longest vector which occurs in the expression. Shorter vectors in the expression are recycled as often as need be (perhaps fractionally) until they match the length of the longest vector. In particular a constant is simply repeated. So with the above assignments the command
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)
y <- c(x, 0, x)
v <- 2*x + y + 1
generates a new vector v of length 11 constructed by adding together, element by element, 2*x repeated 2.2 times, y repeated just once, and 1 repeated 11 times.
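The quoted example can be checked directly. This is a sketch that wraps the expression in suppressWarnings(), an addition not in the manual's text, because R warns when the longer length (11) is not a multiple of the shorter one (5):

```r
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)     # length 5
y <- c(x, 0, x)                       # length 11

# 2*x is recycled "2.2 times" to reach length 11; the constant 1 is repeated 11 times.
v <- suppressWarnings(2 * x + y + 1)

length(v)   # 11
v[1]        # 2*10.4 + 10.4 + 1 = 32.2
v[6]        # 2*10.4 + 0    + 1 = 21.8  (x has wrapped back to its first element)
v[11]       # 2*10.4 + 21.7 + 1 = 43.5  (x has wrapped a second time)
```

When the shorter length divides the longer one evenly, as with the length-1 denominator in the boltzmann function, no warning is raised.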
Upvotes: 4