Reputation: 477
I have a column (species_name) containing species names, and I use the following lines to strip various expressions from those strings in my df:
df$species_name<-gsub("[0-9]+.+", "", df$species_name)
df$species_name<-gsub("[a-z,A-Z]+[0-9]+.+", "", df$species_name)
df$species_name<-sub("^[A-Z,a-z] ", "", df$species_name)
df$species_name<-gsub("^[A-Z,a-z][A-Z,a-z] ", "", df$species_name)
df$species_name<-gsub(" [A-Z,a-z]$", "", df$species_name)
df$species_name<-gsub(" [A-Z,a-z][A-Z,a-z]$", "", df$species_name)
df$species_name<-gsub("[0-9]+.*", "", df$species_name)
df$species_name<-gsub("[a-z,A-Z]+[0-9]+.*", "", df$species_name)
df$species_name<-gsub("[0-9]+", "", df$species_name)
df$species_name<-gsub(" +$", "", df$species_name)
df$species_name<-gsub("-", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" sp.", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" sp. nov", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" cf.", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" complex.", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" cmplx.", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" cmplx$", "", df$species_name)
df$species_name<-gsub(" pr.", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" f.", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" nr.", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" s.l.", "", df$species_name,fixed = TRUE)
df$species_name<-gsub(" grp.", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" [A-Z]+.+$", "", df$species_name)
df$species_name<-gsub(" type", "", df$species_name,fixed=TRUE)
df$species_name<-gsub(" group", "", df$species_name,fixed=TRUE)
Is there a way to make this process cleaner, instead of having so many lines? Since this code sits inside a function in a Shiny app, I wondered whether there is a simpler approach. Thanks in advance for any answers.
Upvotes: 0
Views: 84
Reputation: 2650
Since every pattern is replaced with the same string, you can combine the patterns into a single regular expression with | (or), for example:
# Patterns that all share the same replacement
patterns <- c('v[ei]r', 'sa')
# Join them into a single regex: 'v[ei]r|sa'
expr <- paste0(patterns, collapse = '|')
gsub(expr, '00', iris$Species)
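Applied to the question, the same idea could look roughly like the sketch below. It covers only the literal qualifier removals (the calls that use fixed=TRUE, plus " cmplx$"), and the sample data frame is made up for illustration. Because the combined pattern is a regex rather than a fixed string, each literal dot is written as [.]. The digit and single-letter clean-up steps depend on the order they run in, so they would probably still need their own calls.

```r
# Qualifiers to strip, written as regex; "[.]" matches a literal dot.
# " sp[.] nov" is listed before " sp[.]" so the longer form is covered.
qualifiers <- c(
  " sp[.] nov", " sp[.]", " cf[.]", " complex[.]", " cmplx[.]", " cmplx$",
  " pr[.]", " f[.]", " nr[.]", " s[.]l[.]", " grp[.]", " type", " group"
)

# Collapse everything into one alternation: " sp[.] nov| sp[.]| cf[.]|..."
qualifier_regex <- paste0(qualifiers, collapse = "|")

# Hypothetical sample data, not taken from the question
df <- data.frame(
  species_name = c("Apis mellifera sp. nov",
                   "Culex pipiens cf.",
                   "Aedes aegypti group"),
  stringsAsFactors = FALSE
)

# One gsub() call replaces the whole block of qualifier removals
df$species_name <- gsub(qualifier_regex, "", df$species_name)
df$species_name
```

Keeping the qualifiers in a vector also means a new one can be added by extending the vector rather than writing another gsub() line.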
Upvotes: 3