Reputation: 613
I'm trying to get a hold on how the apply function works. Here is what I tried:
df = data.frame(x=c(1,2,3,4,5), x2=c(1,2,3,4,5))
apply(df$x2, 2, function(x) x*2) #doesn't work
apply(df["x2"], 2, function(x) x*2) #works
apply(df[,2], 2, function(x) (x*2)) #doesn't work
apply(df[2], 2, function(x) x*2) #works (surprisingly)
apply(df[2,], 1, function(x) x*2) #works, but gives me vertical vector
apply(df[2,], 2, function(x) x*2) #works; this gives me the output I expected in line above
Questions (as indicated by comments):
Upvotes: 3
Views: 863
Reputation: 7951
Why doesn't line 2 work although line 3 does?
df$x2 is a vector, i.e. c(1,2,3,4,5), whereas df["x2"] is a data frame with just one column. The vector has no second dimension to apply over. See ?'[' in R for details of how subsetting works; this isn't really related to the apply function.
Why can I use [2,] to refer to row 2 (line 6), but cannot use [,2] to refer to column 2 (line 4), but have to use [2] (line 5) instead?
Again, see the subsetting help page, but df[,2,drop=FALSE]
is probably what you need.
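A minimal sketch of the difference, reusing the df from the question (the names v, col_df and doubled are only illustrative):

```r
df <- data.frame(x = c(1, 2, 3, 4, 5), x2 = c(1, 2, 3, 4, 5))

# Default behaviour: a single selected column is dropped to a plain vector.
v <- df[, 2]
is.null(dim(v))    # no dimensions, so apply() has no margin to work over

# drop = FALSE keeps the result as a one-column data frame.
col_df <- df[, 2, drop = FALSE]
dim(col_df)

# With dimensions in place, apply() over columns works as with df[2].
doubled <- apply(col_df, 2, function(x) x * 2)
dim(doubled)
```

So df[, 2, drop = FALSE] behaves like df[2] as far as apply is concerned.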
In line 6 I expected to get what I got from line 7: row 2 (with doubled values) in a row. Why didn't I get this from line 6, I indicated row with MARGIN=1?
The value section of ?apply
explains the dimensions that you can expect as output from a call to apply:
If each call to FUN returns a vector of length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1. If n equals 1, apply returns a vector if MARGIN has length 1 and an array of dimension dim(X)[MARGIN] otherwise.
In this case we see that:
> dim(df[2,])
# [1] 1 2
and so:
apply(df[2,], 1, function(x) x*2)
has n=2 and dim(df[2,])[1]=1, so you should expect an output with dimensions c(2,1).
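The rule quoted from ?apply can be checked directly. This is a small sketch using the question's df; row2, by_row and by_col are names made up for the example:

```r
df <- data.frame(x = c(1, 2, 3, 4, 5), x2 = c(1, 2, 3, 4, 5))
row2 <- df[2, ]    # 1 row, 2 columns

# MARGIN = 1: FUN is called once (one row) and returns a length-2 vector,
# so n = 2 and the result has dimensions c(2, 1), i.e. a column.
by_row <- apply(row2, 1, function(x) x * 2)
dim(by_row)

# MARGIN = 2: FUN is called twice (two columns), each call returning a
# length-1 vector, so n = 1 and the result is a plain vector of length 2.
by_col <- apply(row2, 2, function(x) x * 2)
dim(by_col)
length(by_col)

# Transposing the MARGIN = 1 result gives the horizontal layout.
t(by_row)
```

If the row layout is wanted from a MARGIN = 1 call, t() on the result is one way to get it.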
Upvotes: 1
Reputation: 11500
apply needs to be used on something whose dim has positive length. For simplicity, think of an object that has rows and columns.
That's why MARGIN takes the values 1 and 2, which stand for the row-wise and column-wise operation.
Check your input values like this:
dim(df["x2"])
dim(df[,2]) #this is NULL, so it does not work
df[,2] gives you a vector, the same as df$x2. A vector does not have rows and columns, so it does not work with apply.
In order to understand what you are doing wrong:
Type ?"["
into your console and read everything. Also play around... what you are already doing!
Have a closer look at the drop
argument.
Lastly, with df[2,] you're subsetting a single row. It's still a data frame.
Check dim(df[2,])
apply(df[2,], 1, function(x) x*2) #works, but gives me vertical vector
apply(df[2,], 2, function(x) x*2) #works; this gives me the output I expected in line above
The reason you don't get the same output is the whole reason why apply exists. Please read ?apply to understand.
If you still have questions after reading the two resources mentioned, feel free to ask more.
Here is a little example:
m <- matrix(1:9,nrow=3)
m
apply(m,1,max) #row-wise max value
apply(m,2,max) #col-wise max value
Upvotes: 3
Reputation: 6813
The problem is subsetting:
First:
df$x2 and df[, 2] are different from df["x2"] and df[2]: the former return a numeric vector, the latter return a data.frame.
Second:
df[2, ] returns the second row of your data.frame. If you use MARGIN = 1, you go through the rows; each row is represented as a (named) vector of length equal to the number of columns in your data.frame.
If you use MARGIN = 2, you go through the columns; again, each column is represented as a (named) vector of length equal to the number of rows in your data.frame.
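To see what the function actually receives for each MARGIN, a quick sketch (row_lengths, col_lengths and row_names are illustrative names, not part of the question):

```r
df <- data.frame(x = c(1, 2, 3, 4, 5), x2 = c(1, 2, 3, 4, 5))

# MARGIN = 1: one call per row; each call sees a vector with one
# element per column.
row_lengths <- apply(df, 1, length)

# MARGIN = 2: one call per column; each call sees the whole column.
col_lengths <- apply(df, 2, length)

# The vector passed for a row carries the column names.
row_names <- as.vector(apply(df[2, ], 1, names))
```

Here row_lengths has five entries, one per row, and col_lengths has two, one per column.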
Upvotes: 1
Reputation: 1441
You should look at the type and dimensions of each expression:
> typeof(df$x2)
[1] "double"
> dim(df$x2)
NULL
> typeof(df["x2"])
[1] "list"
> dim(df["x2"])
[1] 5 1
> typeof(df[, 2])
[1] "double"
> dim(df[, 2])
NULL
> typeof(df[2])
[1] "list"
> dim(df[2])
[1] 5 1
> typeof(df[2, ])
[1] "list"
> dim(df[2,])
[1] 1 2
Line 2 does not work because you are trying to apply a function to an object whose dim is NULL (dim(X) must have a positive length). The rest is similar. You must pay attention to the type of the expression passed to apply. I recommend simply printing the values to check that they are suitable for the apply function.
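One way to make that check explicit is a small guard. has_dims below is a hypothetical helper written for this example, not a base R function:

```r
df <- data.frame(x = c(1, 2, 3, 4, 5), x2 = c(1, 2, 3, 4, 5))

# Hypothetical helper: TRUE when the object has a dim attribute,
# which apply() requires.
has_dims <- function(obj) length(dim(obj)) > 0

has_dims(df$x2)      # plain vector
has_dims(df["x2"])   # one-column data frame

# Without dimensions apply() stops with an error; capture its message
# instead of aborting the script.
res <- tryCatch(
  apply(df$x2, 2, function(x) x * 2),
  error = function(e) conditionMessage(e)
)

# For a plain vector, the arithmetic is already vectorised.
doubled <- df$x2 * 2
```

For a single column like this, the vectorised df$x2 * 2 is usually all that is needed.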
Upvotes: 0