Zora Sinay

Reputation: 43

Why isn't rownames() working on my dataframe in R?

I want to create a vector with rownames of certain rows of my dataframe, but I keep failing and I feel there is something obvious I am missing. My dataframe is extremely large but I have created an example that gives me the exact same problem.

resmakeup <- data.frame("example" = c(4, -3, 2, 1), 
                         row.names = c("number1", "number2", "number3", "number4")
                        )
selection <- rownames(resmakeup[abs(resmakeup$example) >= 2,])

So, if my table looks like this:

        example
number1       4
number2      -3
number3       2
number4       1

I want the "selection" vector to contain number1, number2 and number3, but that is not working. Instead, I get an empty vector. I checked with has_rownames() whether the dataframe has rownames, and it returned TRUE. I also checked whether the selection itself, resmakeup[abs(resmakeup$example) >= 2,], works, and it does.

What am I doing wrong and how do I fix it?

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=Dutch_Netherlands.1252  LC_CTYPE=Dutch_Netherlands.1252    LC_MONETARY=Dutch_Netherlands.1252
[4] LC_NUMERIC=C                       LC_TIME=Dutch_Netherlands.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] writexl_1.3.1               forcats_0.5.0               stringr_1.4.0               purrr_0.3.4                
 [5] readr_1.4.0                 tidyr_1.1.2                 tibble_3.0.4                tidyverse_1.3.0            
 [9] RColorBrewer_1.1-2          readxl_1.3.1                pheatmap_1.0.12             ggthemes_4.2.4             
[13] ggrepel_0.9.1               ggplot2_3.3.3               GEOquery_2.58.0             edgeR_3.32.1               
[17] limma_3.46.0                dplyr_1.0.2                 DESeq2_1.30.0               SummarizedExperiment_1.20.0
[21] Biobase_2.50.0              MatrixGenerics_1.2.0        matrixStats_0.57.0          GenomicRanges_1.42.0       
[25] GenomeInfoDb_1.26.2         IRanges_2.24.1              S4Vectors_0.28.1            BiocGenerics_0.36.0        
[29] ashr_2.2-47                

loaded via a namespace (and not attached):
 [1] fs_1.5.0               bitops_1.0-6           lubridate_1.7.9.2      bit64_4.0.5            httr_1.4.2            
 [6] tools_4.0.2            backports_1.2.1        R6_2.5.0               irlba_2.3.3            DBI_1.1.1             
[11] colorspace_2.0-0       withr_2.4.0            tidyselect_1.1.0       bit_4.0.4              compiler_4.0.2        
[16] cli_2.2.0              rvest_0.3.6            xml2_1.3.2             DelayedArray_0.16.0    labeling_0.4.2        
[21] scales_1.1.1           SQUAREM_2021.1         genefilter_1.72.0      mixsqp_0.3-43          digest_0.6.27         
[26] XVector_0.30.0         pkgconfig_2.0.3        dbplyr_2.0.0           invgamma_1.1           rlang_0.4.10          
[31] rstudioapi_0.13        RSQLite_2.2.1          farver_2.0.3           generics_0.1.0         jsonlite_1.7.2        
[36] BiocParallel_1.24.1    RCurl_1.98-1.2         magrittr_2.0.1         GenomeInfoDbData_1.2.4 Matrix_1.2-18         
[41] fansi_0.4.2            Rcpp_1.0.5             munsell_0.5.0          lifecycle_0.2.0        stringi_1.5.3         
[46] zlibbioc_1.36.0        grid_4.0.2             blob_1.2.1             crayon_1.3.4           lattice_0.20-41       
[51] haven_2.3.1            splines_4.0.2          annotate_1.68.0        hms_1.0.0              locfit_1.5-9.4        
[56] pillar_1.4.7           geneplotter_1.68.0     reprex_0.3.0           XML_3.99-0.5           glue_1.4.2            
[61] modelr_0.1.8           vctrs_0.3.6            cellranger_1.1.0       gtable_0.3.0           assertthat_0.2.1      
[66] xfun_0.20              xtable_1.8-4           broom_0.7.3            survival_3.1-12        truncnorm_1.0-8       
[71] tinytex_0.29           AnnotationDbi_1.52.0   memoise_1.1.0          ellipsis_0.3.1        

Upvotes: 0

Views: 2665

Answers (2)

r2evans

Reputation: 160407

When you run into problems, start executing expressions from the outside inwards to find where things start going wrong.

rownames(resmakeup[abs(resmakeup$example) >= 2,])
# NULL
resmakeup[abs(resmakeup$example) >= 2,]
# [1]  4 -3  2

Okay, you cannot get row names from a plain numeric vector.

The culprit here is R's default behavior of dropping the dimensions of a data.frame when a row selection leaves only one column, so you get a bare vector back instead of a data.frame. (FYI, both dplyr and data.table choose not to follow that frustrating behavior.) You can get around that with drop=FALSE.

resmakeup[abs(resmakeup$example) >= 2,, drop = FALSE]
#         example
# number1       4
# number2      -3
# number3       2

and therefore

rownames(resmakeup[abs(resmakeup$example) >= 2,, drop = FALSE])
# [1] "number1" "number2" "number3"
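To see when the drop happens, here is a small sketch on the same toy data. The `two_col` data frame and its `other` column are made up for illustration and are not part of the question. Under default settings, a row subset of a one-column data.frame is simplified to a vector, while the same subset of a two-column data.frame stays a data.frame and keeps its row names.

```r
resmakeup <- data.frame(example = c(4, -3, 2, 1),
                        row.names = c("number1", "number2", "number3", "number4"))
keep <- abs(resmakeup$example) >= 2

# One column: the default row subset is simplified to a bare vector
dropped <- resmakeup[keep, ]
is.data.frame(dropped)  # FALSE
rownames(dropped)       # NULL

# Hypothetical second column, added only to show the contrast
two_col <- resmakeup
two_col$other <- c("a", "b", "c", "d")
kept <- two_col[keep, ]
is.data.frame(kept)     # TRUE
rownames(kept)
# [1] "number1" "number2" "number3"
```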

I'll take this opportunity to soap-box for a base R function that makes this easier: subset. It is simpler to read, and it does not exhibit the drop= "feature".

rownames(subset(resmakeup, abs(example) >= 2))
# [1] "number1" "number2" "number3"

Its use of non-standard evaluation (i.e., the ability to refer to columns such as example without the df$ prefix) makes it a bit simpler to read, and it does not drop by default.

Upvotes: 2

bouncyball

Reputation: 10761

This is an issue with subsetting a data.frame (see this help file for more information). You need to specify drop = FALSE in your subsetting call:

rownames(resmakeup[abs(resmakeup$example) >= 2,,drop = FALSE])
# [1] "number1" "number2" "number3"

If you inspect what running resmakeup[abs(resmakeup$example) >= 2,] returns, you'll notice that it's returning a vector and not a data.frame (coercing to lowest possible dimension). Using drop = FALSE will preserve the data.frame type after subsetting.
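A quick way to check this is to compare the class of both results, as in the sketch below. The last step is not from the answer above: it is a common base R idiom, offered here as one possible alternative that indexes the row-name vector directly with the same logical condition, so the data.frame is never subset at all. The helper name `keep` is only illustrative.

```r
resmakeup <- data.frame(example = c(4, -3, 2, 1),
                        row.names = c("number1", "number2", "number3", "number4"))
keep <- abs(resmakeup$example) >= 2

class(resmakeup[keep, ])                # "numeric"
class(resmakeup[keep, , drop = FALSE])  # "data.frame"

# Alternative: index the row names themselves, so drop= never comes into play
selection <- rownames(resmakeup)[keep]
selection
# [1] "number1" "number2" "number3"
```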

Upvotes: 1
