Reputation: 83

How to use the stringr function str_subset to subset rows in a data table

Consider the data table CM below:

CM<-read.table(header=TRUE,
 stringsAsFactors = FALSE,
 text ="
subject cmdecod
001 CARBAMAZEPINE
001 DOXORUBICIN
001 CARBAZINE
001 VINCRISTINE
002 PILSICAINIDE
002 PHENOBARBITAL
002 BUTALBITAL
002 RIFAP"
 )

I want to subset this on cmdecod == "RIFAP" using str_subset(). Please help me.

I tried str_detect() as below, but I am not sure why str_subset() does not work here:

library(dplyr)    # provides filter()
library(stringr)  # provides str_detect()
CM2 <- filter(CM, str_detect(cmdecod, "RIFAP"))
CM2

Upvotes: 1

Answers (2)

TarJae

Reputation: 79246

In addition to akrun's great answer, here is some more information:

library(stringr)
str_subset(CM$cmdecod, "RIFAP")

str_subset is a wrapper around:

CM$cmdecod[str_detect(CM$cmdecod, "RIFAP")]

and is equivalent to:

grep("RIFAP", CM$cmdecod, value=TRUE)

https://stringr.tidyverse.org/reference/str_subset.html
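
As a quick check of the equivalence described above, the three forms can be compared directly. This is only a minimal sketch: the vector is retyped from the question's CM$cmdecod column, and the names via_subset, via_detect and via_grep are made up for illustration.

```r
library(stringr)

# Same values as CM$cmdecod in the question, retyped so the block runs on its own
cmdecod <- c("CARBAMAZEPINE", "DOXORUBICIN", "CARBAZINE", "VINCRISTINE",
             "PILSICAINIDE", "PHENOBARBITAL", "BUTALBITAL", "RIFAP")

via_subset <- str_subset(cmdecod, "RIFAP")            # stringr shortcut
via_detect <- cmdecod[str_detect(cmdecod, "RIFAP")]   # what it wraps
via_grep   <- grep("RIFAP", cmdecod, value = TRUE)    # base R counterpart

identical(via_subset, via_detect)
identical(via_subset, via_grep)

# The pattern is a regular expression, not an equality test,
# so a shorter pattern can match several elements
str_subset(cmdecod, "CARBA")
```

All three return the matching strings themselves rather than a TRUE/FALSE value per element, which is why none of them can be passed straight to filter().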

Upvotes: 1

akrun

Reputation: 887901

str_subset() doesn't return a logical vector; it returns the matching elements themselves. According to ?str_subset:

str_subset() is a wrapper around x[str_detect(x, pattern)], and is equivalent to grep(pattern, x, value = TRUE)

whereas filter() expects a logical vector to select the rows.

library(stringr)
str_subset(CM$cmdecod, "RIFAP")
[1] "RIFAP"

Thus, it is a way to extract the matching values from the original vector. "RIFAP" is an actual element of the original vector, so it is returned:

CM$cmdecod
[1] "CARBAMAZEPINE" "DOXORUBICIN"   "CARBAZINE"     "VINCRISTINE"  
[5] "PILSICAINIDE"  "PHENOBARBITAL" "BUTALBITAL"    "RIFAP"
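
If the goal is still to subset rows with the help of str_subset(), one possible approach is to turn its result back into a logical vector with %in%. The sketch below is an illustration under stated assumptions rather than the only way to do it: CM is rebuilt with data.frame() so the block is self-contained (subject is stored as the integers 1 and 2, which is how read.table() parses 001 and 002), and rows_detect and rows_subset are hypothetical names.

```r
library(stringr)

# Rebuild the question's table (assumption: subject parsed as integer)
CM <- data.frame(
  subject = rep(c(1L, 2L), each = 4),
  cmdecod = c("CARBAMAZEPINE", "DOXORUBICIN", "CARBAZINE", "VINCRISTINE",
              "PILSICAINIDE", "PHENOBARBITAL", "BUTALBITAL", "RIFAP"),
  stringsAsFactors = FALSE
)

# str_detect() yields one TRUE/FALSE per row, so it can index rows directly
rows_detect <- CM[str_detect(CM$cmdecod, "RIFAP"), ]

# str_subset() yields the matching values; %in% converts them back
# into a logical vector with one entry per row
rows_subset <- CM[CM$cmdecod %in% str_subset(CM$cmdecod, "RIFAP"), ]

rows_subset
identical(rows_detect, rows_subset)
```

With dplyr loaded, the same idea should carry over as filter(CM, cmdecod %in% str_subset(cmdecod, "RIFAP")), although str_detect() remains the more direct choice there.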

Upvotes: 2
