nbenn

Reputation: 691

R single element subsetting (aka `[[`) using a matrix

Assuming I'm understanding the documentation of [[ correctly, a matrix can be used to subset a data.frame:

A third form of indexing is via a numeric matrix with the one column for each dimension: each row of the index matrix then selects a single element of the array, and the result is a vector. Negative indices are not allowed in the index matrix. NA and zero values are allowed: rows of an index matrix containing a zero are ignored, whereas rows containing an NA produce an NA in the result.

While this works for [, I'm struggling to understand how to do this with [[.

mtcars[1:6, 1:6]
#>                    mpg cyl disp  hp drat    wt
#> Mazda RX4         21.0   6  160 110 3.90 2.620
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875
#> Datsun 710        22.8   4  108  93 3.85 2.320
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440
#> Valiant           18.1   6  225 105 2.76 3.460
(ind <- matrix(1:6, ncol = 2))
#>      [,1] [,2]
#> [1,]    1    4
#> [2,]    2    5
#> [3,]    3    6
mtcars[ind]
#> [1] 110.00   3.90   2.32
mtcars[[ind]]
#> Error in as.matrix(x)[[i]]: attempt to select more than one element in vectorIndex

Is this a bug? Or am I misinterpreting the documentation?

Here is the source of [[.data.frame (R 3.6.1):

function (x, ..., exact = TRUE)
{
    na <- nargs() - !missing(exact)
    if (!all(names(sys.call()) %in% c("", "exact")))
        warning("named arguments other than 'exact' are discouraged")
    if (na < 3L)
        (function(x, i, exact) if (is.matrix(i))
            as.matrix(x)[[i]]
        else .subset2(x, i, exact = exact))(x, ..., exact = exact)
    else {
        col <- .subset2(x, ..2, exact = exact)
        i <- if (is.character(..1))
            pmatch(..1, row.names(x), duplicates.ok = TRUE)
        else ..1
        col[[i, exact = exact]]
    }
}

Upvotes: 2

Views: 629

Answers (1)

Gregor Thomas

Reputation: 145965

The doc page (?Extract) you reference says that arrays can be indexed by matrices. Implicitly, I take that to mean non-arrays cannot be indexed by matrices. Data frames are not arrays, so they cannot be indexed by matrices. (Matrices are arrays, of course.)


I do think you're misinterpreting the documentation. You're looking at a page that jointly documents [, [[, and $. In the argument description, it says:

When indexing arrays by [ a single argument i can be a matrix with as many columns as there are dimensions of x...

The section you quote at the top of your question comes later on, under the heading Matrices and Arrays, which I take to be a section about subsetting matrices and arrays, not about using matrices as indices. (Look at the rest of the section, and the sections before and after, and I think you'll agree with me.)
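To make the array/data frame distinction concrete, here is a minimal sketch using the built-in mtcars data and the ind matrix from the question. The names m, from_matrix and from_df are mine. As far as I can tell, mtcars[ind] only works because [.data.frame coerces the data frame to a matrix before applying a matrix index, so the indexing still happens on an array.

```r
# mtcars ships with base R (package datasets)
ind <- matrix(1:6, ncol = 2)   # rows are (row, column) pairs: (1,4), (2,5), (3,6)

m <- as.matrix(mtcars)         # a numeric matrix, i.e. a 2-d array

is.array(mtcars)               # FALSE: a data frame is a list of columns
is.array(m)                    # TRUE

from_matrix <- m[ind]          # one element per row of ind
from_df     <- mtcars[ind]     # the call from the question

identical(from_matrix, from_df)
```

Both calls go through [, which is the only operator the quoted argument description mentions for matrix indices.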

Nowhere on that documentation page does it talk about using matrices as indices for [[.

I'm surprised it's handled specially in the [[ code you show - but as near as I can tell, a matrix given to [[.data.frame will error out unless it's a 1x1 matrix, in which case the data frame is treated as a matrix and the single element is returned, for some arcane reason (probably "compatibility with S", though I've no good guess as to why S would allow it).
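If the goal is one element per (row, column) pair, the two-index form of [[ can be applied row by row. This is only a sketch of one workaround, not something the documentation prescribes: pick_each is a hypothetical helper name, and it assumes every selected column is numeric. The last two calls exercise the is.matrix(i) branch from the source in the question.

```r
ind <- matrix(1:6, ncol = 2)   # (row, column) pairs from the question

# Two-index form: takes the else branch of `[[.data.frame` (column first, then row)
first <- mtcars[[1, 4]]        # hp of "Mazda RX4", i.e. 110

# Hypothetical helper: apply the two-index form once per row of an index matrix
pick_each <- function(df, idx) {
  vapply(seq_len(nrow(idx)),
         function(k) df[[idx[k, 1], idx[k, 2]]],
         numeric(1))           # assumes every selected column is numeric
}
picked <- pick_each(mtcars, ind)

# A multi-row index matrix is passed on to as.matrix(x)[[i]] and fails there;
# the error message is captured instead of stopping the script
err <- tryCatch(mtcars[[ind]], error = function(e) conditionMessage(e))

# A 1x1 matrix holds a single index, so as.matrix(x)[[i]] can return one element
single <- mtcars[[matrix(1L)]]
```

Here picked should hold the same values as mtcars[ind] from the question. For a plain vector result, mtcars[ind] is the simpler choice.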

Upvotes: 1
