Evan Carroll

Reputation: 1

Why is `row.names` preferred over `rownames`?

R's base package has two functions for getting and setting row names: `row.names` and `rownames`.

However, the documentation for `row.names` says: "For a data frame, ‘rownames’ and ‘colnames’ eventually call ‘row.names’ and ‘names’ respectively, but the latter are preferred." Why is `row.names` preferred? Wouldn't it be easier to ignore `row.names` and just call `rownames`?

Upvotes: 40

Views: 9020

Answers (1)

Gordon Smyth

Reputation: 763

row.names() is an S3 generic function, whereas rownames() is a lower-level, non-generic function. rownames() is in effect the default method for row.names(): it is what gets applied to any object in the absence of a more specific method.
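You can check the generic/non-generic distinction from the console. This is only a rough sketch: it looks for a `UseMethod()` call in each function's source and asks `utils::getS3method()` whether a data frame method is registered. The variable names are my own.

```r
# A function is an S3 generic if its body dispatches via UseMethod().
row_names_is_generic <- any(grepl("UseMethod", deparse(row.names)))
rownames_is_generic  <- any(grepl("UseMethod", deparse(rownames)))

row_names_is_generic
rownames_is_generic

# row.names() has a dedicated method for data frames ...
df_method <- utils::getS3method("row.names", "data.frame")
is.function(df_method)

# ... whereas rownames() has none; optional = TRUE makes the lookup
# return NULL instead of raising an error.
no_method <- utils::getS3method("rownames", "data.frame", optional = TRUE)
is.null(no_method)
```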

If you are operating on a data frame x, then it is more efficient to use row.names(x), because there is a specific row.names() method for data frames. That method simply extracts the "row.names" attribute that is already stored in x. By contrast, because of the definition of rownames() and the inter-relationships between the functions, rownames(x) has to assemble all the dimension names of x, which for a data frame means calling row.names(x) and names(x) and combining the results, and then throw the column names away again. So the process even involves a call to row.names(x) as an intermediate step. This all usually happens so quickly that you don't notice it, but just extracting the attribute is obviously more efficient.
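To make the data frame case concrete, here is a small sketch with a made-up data frame `x`. Both functions return the same row names; they just take different routes to get there.

```r
# Toy data frame with explicit row names (illustrative values only).
x <- data.frame(a = 1:3, b = c("u", "v", "w"),
                row.names = c("r1", "r2", "r3"))

# The row names are stored directly as an attribute of the data frame.
stored <- attr(x, "row.names")

# The data frame method for row.names() reads that attribute.
via_generic <- row.names(x)

# rownames() goes through dimnames(), which builds a list of both
# row names and column names, and then keeps only the first element.
dn <- dimnames(x)
via_rownames <- rownames(x)

identical(via_generic, via_rownames)
```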

It would be logical to just use the generic version row.names() all the time, since it always dispatches to the appropriate method. There is no practical advantage in using rownames(x) over row.names(x). For object classes that have a defined row.names method, rownames(x) is wrong because it bypasses that method. For object classes with no defined row.names method, the two functions are equivalent because row.names(x) simply calls rownames(x).
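Here is a sketch of both cases. The class `labelled_table` and its "labels" attribute are hypothetical, invented purely for illustration: the class keeps its row labels somewhere other than dimnames and defines its own row.names method.

```r
# Hypothetical S3 class that stores row labels in a "labels" attribute.
obj <- structure(matrix(1:6, nrow = 3),
                 labels = c("a", "b", "c"),
                 class = "labelled_table")

# Class-specific method: row.names() will dispatch to this.
row.names.labelled_table <- function(x) attr(x, "labels")

from_generic  <- row.names(obj)  # uses the method defined above
from_rownames <- rownames(obj)   # bypasses it and looks only at dimnames

from_generic
from_rownames

# With no class-specific method (a plain matrix), the two agree.
m <- matrix(1:4, nrow = 2, dimnames = list(c("r1", "r2"), c("x", "y")))
identical(row.names(m), rownames(m))
```

In this sketch rownames(obj) finds no dimnames and returns NULL, while row.names(obj) returns the labels the class actually intends.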

The reason why both functions exist is historical. rownames() is the older function and was part of the R language before generic functions and methods were introduced. It was intended only for use on matrices, but it will work fine on any data object that has a dimnames attribute. I personally use rownames(x) when x is a matrix and row.names(x) otherwise but, as I have said, one could just as well use row.names(x) all the time.

Upvotes: 41
