R - how to prevent row.names when selecting rows from a data frame

Question

Suppose I create a dataframe (just to keep it simple):

testframe <- data.frame( a = c(1,2,3,4), b = c(5,6,7,8))

Thus, I have two variables (columns) and four cases (rows).

If I select some of the rows BEGINNING WITH THE FIRST row, I get a subset of the dataframe, e.g.:

testframe2 <- testframe[1:2,] #selecting the first two rows

But if I do the same with rows NOT BEGINNING WITH THE FIRST ROW, I get another column containing the row numbers of the original dataframe.

testframe3 <- testframe[3:4,] #selecting the last two rows

leads to:

  a b
3 3 7
4 4 8

What can I do to prevent the new row.names variable in the first place? I know that I can delete it afterwards but maybe it is still possible to avoid it from the beginning.

Thanks for your help!

Simon O'Hanlon · Accepted Answer

The subset keeps the row.names of the original data frame. They are row labels, not an extra column. Just renumber the rows using rownames<- like this...

rownames( testframe3 ) <- seq_len( nrow( testframe3 ) )
#   a b
# 1 3 7
# 2 4 8
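A quick way to see that the leading numbers are row labels rather than a new variable is to inspect the subset directly. This is a minimal sketch that rebuilds testframe from the question so it runs on its own:

```r
# Rebuild the example data from the question
testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))
testframe3 <- testframe[3:4, ]

ncol(testframe3)      # still two columns
names(testframe3)     # only a and b; the row labels are not a column
rownames(testframe3)  # the labels carried over from the original rows

# Renumber the rows from 1 to the number of rows in the subset
rownames(testframe3) <- seq_len(nrow(testframe3))
rownames(testframe3)
```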

Programmatically, seq_len( nrow( x ) ) is preferred to, say, 1:nrow( x ), because look what happens in edge cases where you select a data.frame of zero rows...

df <- testframe[0,]
# [1] a b
# <0 rows> (or 0-length row.names)
rownames(df) <- seq_len( nrow( df ) ) #  No error thrown - returns a length 0 vector of rownames

#  But...
rownames(df) <- 1:nrow( df )
# Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
#   invalid 'row.names' length

#  Because...
1:nrow( df )
# [1] 1 0
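If the renumbering is needed in several places, the seq_len approach above could be wrapped in a small function. This is only a sketch: reset_row_names is a hypothetical helper name, not a base R function.

```r
# reset_row_names is a hypothetical helper, not part of base R.
# It renumbers the rows of a data frame and returns the result.
reset_row_names <- function(x) {
  rownames(x) <- seq_len(nrow(x))  # seq_len(0) is empty, so zero rows is safe
  x
}

testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

last_two <- reset_row_names(testframe[3:4, ])  # rows labelled from 1 again
empty <- reset_row_names(testframe[0, ])       # zero-row case, no error
```

Because the helper uses seq_len rather than 1:nrow( x ), it handles the zero-row subset the same way as any other.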

Alternatively, you can do it in one step by wrapping the subset in a call to data.frame, but this is really inefficient if you want to derive the number of rows programmatically (because you will have to subset twice), and I don't recommend it over the rownames<- method:

data.frame( testframe[3:4,] , row.names = 1:2 )
#  a b
#1 3 7
#2 4 8
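Another option, not mentioned in the answer above, is to assign NULL to the row names, which makes R fall back to its automatic row numbering. A minimal sketch, again rebuilding testframe so it runs on its own:

```r
testframe <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

testframe3 <- testframe[3:4, ]
rownames(testframe3) <- NULL  # reset to automatic row names
testframe3

# The same assignment on a zero-row subset
empty <- testframe[0, ]
rownames(empty) <- NULL
```

This avoids computing the number of rows at all, so the 1:nrow( x ) edge case does not arise.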
