Reputation: 367

r subset array using vector

I feel like this question should have already been answered, but I couldn't find an answer. I have an array and I want to subset it using a vector. I know how to do it the hard way, but I'm sure there's got to be an easy way. Any ideas?

Here's my example:

dat <- data.frame(a = rep(letters[1:3], 2), b = rep(letters[1:2], 3), c = c(rep("a", 5), "b"), x = rnorm(6), stringsAsFactors = FALSE)

l <- by(dat[ , "x"], dat[ , 1:3], mean)

l["a", "a", "a"] # works  
l[c("a", "a", "a")] # does not work

So I guess I need a way to remove the c() wrapper from c("a", "a", "a") before passing it to l.

Upvotes: 3

Answers (2)

ZNK

Reputation: 2353

This has already been answered, but I wanted to make things a little bit more clear. Let's take your example here:

dat <- data.frame(a = rep(letters[1:3], 2), b = rep(letters[1:2], 3),
    c = c(rep("a", 5), "b"), x = rnorm(6), stringsAsFactors = FALSE)

l <- by(dat[, "x"], dat[, 1:3], mean)

l["a", "a", "a"]  # works  

## [1] 1.246

l[c("a", "a", "a")]  # does not work

## [1] NA NA NA

A previous answer suggested using matrix(rep("a", 3), nrow=1) for subsetting. I want to expand on why this works. First, let's look at the differences between these two data structures:

a.mat <- matrix(rep("a", 3), nrow = 1)
a.vec <- c("a", "a", "a")  # Note: this is equivalent to rep('a', 3)
a.mat

##      [,1] [,2] [,3]
## [1,] "a"  "a"  "a"

a.vec

## [1] "a" "a" "a"

as.matrix(a.vec)

##      [,1]
## [1,] "a" 
## [2,] "a" 
## [3,] "a"

l[a.mat]

## [1] 1.246

l[a.vec]

## [1] NA NA NA

l[as.matrix(a.vec)]

## [1] NA NA NA

a.mat and a.vec hold the same values, but R does not treat them the same way when subsetting. When you index with a matrix, R treats each row as one set of coordinates and each column as a different dimension. If the number of columns in the matrix matches the number of dimensions of the object being subsetted, each row selects a single element. a.mat has one row and three columns, so it selects the one element at ("a", "a", "a"). as.matrix(a.vec), by contrast, has three rows and only one column, because R fills matrices in column-major order (column by column), so it does not qualify.
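To make the row-per-element rule concrete, here is a minimal sketch. The array arr and the index matrix idx are made up for illustration and stand in for the by() result above:

```r
# Hypothetical 2 x 2 x 2 array with dimnames (not the `l` from the question).
arr <- array(1:8, dim = c(2, 2, 2),
             dimnames = list(c("a", "b"), c("a", "b"), c("a", "b")))

# Each row of the index matrix is one (dim1, dim2, dim3) coordinate,
# so a two-row matrix selects two elements.
idx <- rbind(c("a", "a", "a"),
             c("b", "a", "b"))

from_matrix <- arr[idx]

# The same elements, selected one at a time with separate arguments.
one_by_one <- c(arr["a", "a", "a"], arr["b", "a", "b"])

print(from_matrix)
print(one_by_one)
```

Both calls should return the same two values, because idx has three columns and arr has three dimensions.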

If the number of columns does not match up, R will collapse the matrix into a vector and attempt to match the element indices that way. Here are some more examples:

a.mat[, -1]  # Dropping a column leaves a plain vector of length 2

## [1] "a" "a"

l[a.mat[, -1]]  # Notice you get NA twice here.

## [1] NA NA

l[matrix(rep("a", 4), nrow = 1)]  # Using a matrix with 4 columns.

## [1] NA NA NA NA

As a side note, when you subset with a character vector, R will attempt to match element names. If the names don't exist, you will get NA when indexing with a single vector, or an error when indexing by dimension:

# Vector example:
x <- letters
x[1]

## [1] "a"

x["a"]

## [1] NA

names(x) <- letters
x[1]

##   a 
## "a"

x["a"]

##   a 
## "a"

x[c("a", "a", "a")]

##   a   a   a 
## "a" "a" "a"

x[a.mat]  # collapsing matrix down to a vector.

##   a   a   a 
## "a" "a" "a"

# Matrix example:
x <- matrix(letters[1:9], nrow = 3, ncol = 3)
x

##      [,1] [,2] [,3]
## [1,] "a"  "d"  "g" 
## [2,] "b"  "e"  "h" 
## [3,] "c"  "f"  "i"

x[c(1, 1)]

## [1] "a" "a"

x[1, 1]

## [1] "a"

x[c("a", "a")]

## [1] NA NA

x["a", "a"]

## Error: no 'dimnames' attribute for array

rownames(x) <- letters[1:3]
colnames(x) <- letters[1:3]
x

##   a   b   c  
## a "a" "d" "g"
## b "b" "e" "h"
## c "c" "f" "i"

x[c(1, 1)]

## [1] "a" "a"

x[1, 1]

## [1] "a"

x[c("a", "a")]

## [1] NA NA

x["a", "a"]

## [1] "a"

And finally, if you use a numeric vector, R indexes by position rather than by name, so there is no name matching to fail. You get whatever is stored at that position, unless the index is out of bounds:

l[c(1,1,1)]

## [1] 1.246 1.246 1.246

l[1, 1, 1]

## [1] 1.246
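If it helps to see how a single position relates to the per-dimension subscripts, base R's arrayInd() does the conversion. This is a small sketch on a made-up 3 x 2 x 2 array (the same shape as l, but with hypothetical values):

```r
# Hypothetical array; the values are just the positions 1..12.
arr <- array(1:12, dim = c(3, 2, 2))

# Convert linear position 7 into (row, column, slice) subscripts.
pos <- arrayInd(7, dim(arr))
print(pos)

# Three ways of reaching the same cell.
by_position   <- arr[7]
by_subscripts <- arr[pos[1], pos[2], pos[3]]
by_matrix     <- arr[pos]   # pos is a one-row, three-column index matrix

print(c(by_position, by_subscripts, by_matrix))
```

Since pos comes back as a one-row matrix with one column per dimension, it can be used directly as a matrix index.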

Upvotes: 4

sgibb

Reputation: 25736

Instead of a vector you could use a matrix:

l[matrix(rep("a", 3), nrow=1)]
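If the key is already stored in a character vector, a few base R idioms turn it into something `[` accepts. The sketch below uses a made-up array arr and key vector rather than the l from the question:

```r
# Hypothetical 2 x 2 x 2 array with dimnames, standing in for `l`.
arr <- array(1:8, dim = c(2, 2, 2),
             dimnames = list(c("a", "b"), c("a", "b"), c("a", "b")))

key <- c("b", "a", "b")

# 1. Reshape the vector into a one-row matrix.
via_matrix <- arr[matrix(key, nrow = 1)]

# 2. t() on a plain vector also yields a one-row matrix.
via_t <- arr[t(key)]

# 3. Spread the vector into separate arguments, i.e. arr["b", "a", "b"].
via_docall <- do.call(`[`, c(list(arr), as.list(key)))

print(c(via_matrix, via_t, via_docall))
```

All three should pick out the same element; the do.call() form is the closest to "removing the c() wrapper".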

Upvotes: 3
