nityansh seth

Reputation: 41

Replace the string value with value in the find list in R

I have a dataset that has a column like

   string<-c('lib1_Rstudio_case1','lib2_Rstudio_case1and2','lib5_python_notthe correct_language','lib3_Jupyter_really_good','lib1_spyder_nice','lib1_R_the_core')
   replacement<-c('Rstudio','Jupyter','spyder','R')

I want to replace each string value with the matching value from replacement, if there is one. I am using the following code right now:

gsub(paste(replacement, collapse = "|"), replacement = replacement, x = string)

This is another piece of code, which I am using to find the matching cases:

string[grepl(paste(replacement, collapse='|'), string, ignore.case=TRUE)]

I want to update the ones that I find. The output should look like this:

Rstudio,Rstudio,'',Jupyter,spyder,R

I don't want to hard-code it. I want to write code that scales to a longer replacement list.

Any help is really appreciated.

Thanks in advance.

Upvotes: 1

Views: 6193

Answers (2)

nityansh seth

Reputation: 41

This is another simple piece of code I used. It does not need gsub, only grepl inside a loop. Thanks for the help.

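string <- c('lib1_Rstudio_case1','lib2_Rstudio_case1and2','lib5_python_notthe correct_language','lib3_Jupyter_really_good','lib1_spyder_nice','lib1_R_the_core')
# Order matters: a short term such as 'R' must come before a longer term
# that contains it ('Rstudio'), because later matches overwrite earlier ones.
replacement <- c('R','Jupyter','spyder','Rstudio')
# Start from empty strings so that unmatched elements stay ''.
replaced <- rep('', length(string))

for (i in seq_along(replacement))
{
  # fixed = TRUE treats each term as a literal substring, not a regex.
  replaced[grepl(replacement[i], string, fixed = TRUE)] <- replacement[i]
}
replaced
# [1] "Rstudio" "Rstudio" ""        "Jupyter" "spyder"  "R"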
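
For comparison, the same result can probably be had without the loop. The following is only a sketch under a few assumptions: the terms contain no regex metacharacters, the longest term should win when several could match, and a term counts as a match anywhere in the string (so 'R' would also match inside any other word containing a capital R). The names ordered, pattern, hits and result are introduced here for illustration and are not from the original post.

```r
string <- c('lib1_Rstudio_case1','lib2_Rstudio_case1and2','lib5_python_notthe correct_language','lib3_Jupyter_really_good','lib1_spyder_nice','lib1_R_the_core')
replacement <- c('Rstudio','Jupyter','spyder','R')

# Put longer terms first so 'Rstudio' is tried before 'R'.
ordered <- replacement[order(nchar(replacement), decreasing = TRUE)]
pattern <- paste(ordered, collapse = "|")

# regexpr() gives the position of the first match, or -1 when there is none.
hits <- regexpr(pattern, string)

# regmatches() returns the matched text for the elements that matched.
result <- rep('', length(string))
result[hits != -1] <- regmatches(string, hits)
result
# [1] "Rstudio" "Rstudio" ""        "Jupyter" "spyder"  "R"
```

Because the alternation is built from replacement itself, adding a term to the vector is the only change needed to extend it.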

Upvotes: 1

Sathish

Reputation: 12723

Isolate the id with the gsub function, then use is.na to find the ids that fall outside the length of replacement. Then replace the elements at those positions with the empty string ''.

EDIT: Since you changed the string data in the question, I modified the gsub call. The pattern now captures the digit right after the lib text and drops the remaining part of each string element.

replacement <- c('Rstudio','Jupyter','spyder','R')

# Original data from the question
string <- c('lib1_Rstudio','lib2_Rstudio','lib5_python','lib3_Jupyter','lib1_spyder','lib1_R')
# TRUE where the digit after "lib" is not a valid position in replacement
index <- is.na( replacement[ as.integer( gsub( "lib([[:digit:]])*[[:alnum:]_ ]*", "\\1", string)) ] )
# Second "_"-separated field of each element
a1 <- sapply( strsplit(string, "_"), function( x ) x[2] )
a1[ index ] <- ''
a1
# [1] "Rstudio" "Rstudio" ""        "Jupyter" "spyder"  "R"

# Updated data from the question
string <- c('lib1_Rstudio_case1','lib2_Rstudio_case1and2','lib5_python_notthe correct_language','lib3_Jupyter_really_good','lib1_spyder_nice','lib1_R_the_core')
index <- is.na( replacement[ as.integer( gsub( "lib([[:digit:]])*[[:alnum:]_ ]*", "\\1", string)) ] )
a1 <- sapply( strsplit(string, "_"), function( x ) x[2] )
a1[ index ] <- ''
a1
# [1] "Rstudio" "Rstudio" ""        "Jupyter" "spyder"  "R"
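
To make the index step easier to follow, the intermediate values can be inspected one at a time. This is an illustrative sketch only; ids and looked_up are names introduced here, and the space in the character class is written as a plain space. Note that the number after lib is used as a position in replacement, so lib5 is flagged only because replacement has four elements.

```r
replacement <- c('Rstudio','Jupyter','spyder','R')
string <- c('lib1_Rstudio_case1','lib2_Rstudio_case1and2','lib5_python_notthe correct_language','lib3_Jupyter_really_good','lib1_spyder_nice','lib1_R_the_core')

# Keep only the digit captured right after "lib".
ids <- as.integer(gsub("lib([[:digit:]])*[[:alnum:]_ ]*", "\\1", string))
ids
# [1] 1 2 5 3 1 1

# Positions beyond length(replacement) come back as NA.
looked_up <- replacement[ids]
is.na(looked_up)
# [1] FALSE FALSE  TRUE FALSE FALSE FALSE
```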

Upvotes: 1
