Reputation: 367
I feel like this question should have already been answered, but I couldn't find it. I have an array and I want to subset it using a vector. I know how to do it the hard way, but I'm sure there's got to be an easy way. Any ideas?
Here's my example:
dat <- data.frame(a = rep(letters[1:3], 2), b = rep(letters[1:2], 3), c = c(rep("a", 5), "b"), x = rnorm(6), stringsAsFactors = FALSE)
l <- by(dat[ , "x"], dat[ , 1:3], mean)
l["a", "a", "a"] # works
l[c("a", "a", "a")] # does not work
So I guess I need a way to remove the c() wrapper from c("a", "a", "a") before passing it to l.
Upvotes: 3
Views: 4237
Reputation: 2353
This has already been answered, but I wanted to make things a little bit more clear. Let's take your example here:
dat <- data.frame(a = rep(letters[1:3], 2), b = rep(letters[1:2], 3), c = c(rep("a",
5), "b"), x = rnorm(6), stringsAsFactors = FALSE)
l <- by(dat[, "x"], dat[, 1:3], mean)
l["a", "a", "a"] # works
## [1] 1.246
l[c("a", "a", "a")] # does not work
## [1] NA NA NA
A previous answer suggested using matrix(rep("a", 3), nrow=1) in subsetting.
I want to expand on why this works. First, let's take a look at what the
differences between these two data structures are:
a.mat <- matrix(rep("a", 3), nrow = 1)
a.vec <- c("a", "a", "a") # Note: this is equivalent to rep('a', 3)
a.mat
##      [,1] [,2] [,3]
## [1,] "a"  "a"  "a"
a.vec
## [1] "a" "a" "a"
as.matrix(a.vec)
##      [,1]
## [1,] "a"
## [2,] "a"
## [3,] "a"
l[a.mat]
## [1] 1.246
l[a.vec]
## [1] NA NA NA
l[as.matrix(a.vec)]
## [1] NA NA NA
a.mat and a.vec look similar when you print them to screen, but R does not treat them the same way. R stores matrices in column-major order, writing and reading them column by column. When you subset with a matrix whose number of columns matches the number of dimensions of the object being subsetted, each row is read as the coordinates of one cell: the first column indexes the first dimension, the second column the second dimension, and so on. That is why the 1 x 3 matrix a.mat works, while the 3 x 1 matrix from as.matrix(a.vec) does not.
If the number of columns does not match up, R will collapse the matrix into a vector and attempt to match the element indices that way. Here are some more examples:
a.mat[, -1] # Now only two columns
## [1] "a" "a"
l[a.mat[, -1]] # Notice you get NA twice here.
## [1] NA NA
l[matrix(rep("a", 4), nrow = 1)] # Using a matrix with 4 columns.
## [1] NA NA NA NA
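To make the row-per-cell behaviour concrete, here is a minimal sketch on a small array whose contents are known in advance. The array `arr` and the index matrix `idx` are made up for illustration and are not the data from the question; each row of `idx` names one cell.

```r
# Hypothetical 2 x 2 x 2 array with named dimensions (not the asker's data).
# Element [i, j, k] holds i + 2 * (j - 1) + 4 * (k - 1).
arr <- array(1:8, dim = c(2, 2, 2),
             dimnames = list(a = c("a", "b"), b = c("a", "b"), c = c("a", "b")))

# One row per cell, one column per dimension.
idx <- rbind(c("a", "a", "a"),
             c("b", "a", "b"),
             c("b", "b", "b"))

picked <- arr[idx]   # three cells looked up in a single call
picked
## [1] 1 6 8
```

So a matrix index is also a convenient way to pull several cells at once, as long as it has exactly one column per dimension.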
As a side note, when you subset with a character vector, R will attempt to match any element names. If they don't exist, you will get an NA or an error:
# Vector example:
x <- letters
x[1]
## [1] "a"
x["a"]
## [1] NA
names(x) <- letters
x[1]
##   a
## "a"
x["a"]
##   a
## "a"
x[c("a", "a", "a")]
##   a   a   a
## "a" "a" "a"
x[a.mat] # collapsing matrix down to a vector.
##   a   a   a
## "a" "a" "a"
# Matrix example:
x <- matrix(letters[1:9], nrow = 3, ncol = 3)
x
##      [,1] [,2] [,3]
## [1,] "a"  "d"  "g"
## [2,] "b"  "e"  "h"
## [3,] "c"  "f"  "i"
x[c(1, 1)]
## [1] "a" "a"
x[1, 1]
## [1] "a"
x[c("a", "a")]
## [1] NA NA
x["a", "a"]
## Error: no 'dimnames' attribute for array
rownames(x) <- letters[1:3]
colnames(x) <- letters[1:3]
x
##   a   b   c
## a "a" "d" "g"
## b "b" "e" "h"
## c "c" "f" "i"
x[c(1, 1)]
## [1] "a" "a"
x[1, 1]
## [1] "a"
x[c("a", "a")]
## [1] NA NA
x["a", "a"]
## [1] "a"
And finally, if you use a numeric vector, each element is treated as a position in the underlying data rather than a name, so you get whatever value is stored at that position (unless it's out of bounds):
l[c(1,1,1)]
## [1] 1.246 1.246 1.246
l[1, 1, 1]
## [1] 1.246
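The difference between the two numeric calls can be sketched on an array whose values equal their own linear positions. The array `arr` below is hypothetical and only serves the illustration; `arrayInd()` is the base R function that converts a linear position into per-dimension indices.

```r
# Hypothetical 2 x 3 x 4 array; each element equals its linear position.
arr <- array(1:24, dim = c(2, 3, 4))

# A plain numeric vector is read as three separate linear positions
# (counted column by column), so three values come back.
by_position <- arr[c(2, 3, 1)]
by_position
## [1] 2 3 1

# A one-row numeric matrix is read as the coordinates of one cell: [2, 3, 1].
one_cell <- arr[t(c(2, 3, 1))]
one_cell
## [1] 6

# arrayInd() maps a linear position back to its coordinates.
coords <- arrayInd(6, dim(arr))
coords
##      [,1] [,2] [,3]
## [1,]    2    3    1
```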
Upvotes: 4
Reputation: 25736
Instead of a vector you could use a matrix:
l[matrix(rep("a", 3), nrow=1)]
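If the indices already live in a vector, as in the question, the vector can be reshaped rather than retyped. The following is a rough sketch on a made-up array (`arr` and `key` are illustrative names, not the asker's objects). It shows two options: `t()` turns the vector into a one-row matrix, and `do.call()` passes each element as a separate argument to `[`.

```r
# Hypothetical 2 x 2 x 2 array with dimnames (not the asker's data).
arr <- array(1:8, dim = c(2, 2, 2),
             dimnames = list(c("a", "b"), c("a", "b"), c("a", "b")))
key <- c("a", "b", "a")   # one index per dimension

# Option 1: a one-row matrix, via t() or matrix(..., nrow = 1).
via_t      <- arr[t(key)]
via_matrix <- arr[matrix(key, nrow = 1)]

# Option 2: splice the elements in as separate arguments,
# which is the same as writing arr["a", "b", "a"].
via_call <- do.call(`[`, c(list(arr), as.list(key)))

c(via_t, via_matrix, via_call)
## [1] 3 3 3
```

The matrix form extends naturally to several cells (one row each), while the `do.call()` form mirrors the comma-separated call exactly.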
Upvotes: 3