Hank Lin
Hank Lin

Reputation: 6479

Deleting nth delimiter in R

I am trying to delete the 5th delimiter in this string:

"Bacteria_Firmicutes_Clostridia_Clostridiales_Rumino_coccaceae_Ruminococcus_Ruminococcus_albus"

so it becomes:

"Bacteria_Firmicutes_Clostridia_Clostridiales_Ruminococcaceae_Ruminococcus_Ruminococcus_albus"

This seems to work, but I feel like there should be a more elegant solution possibly with regex and str_replace

library(stringr)
name <- "Bacteria_Firmicutes_Clostridia_Clostridiales_Rumino_coccaceae_Ruminococcus_Ruminococcus_albus"
index <- str_locate_all(name, "_")[[1]]
str_sub(name, index[5, "start"], index[5, "end"]) <- ""
name

Upvotes: 2

Views: 51

Answers (1)

U13-Forward
U13-Forward

Reputation: 71570

Try gsub:

> gsub("((?:[^_]+_){4}[^_]+)_", "\\1", name)
[1] "Bacteria_Firmicutes_Clostridia_Clostridiales_Ruminococcaceae_Ruminococcus_Ruminococcus_albus"
> 

Or a less "pretty" way:

> gsub("([^_]*_[^_]*_[^_]*_[^_]*_[^_]*)_", "\\1", name)
[1] "Bacteria_Firmicutes_Clostridia_Clostridiales_Ruminococcaceae_Ruminococcus_Ruminococcus_albus"
> 

Or with the strex library:

> library(strex)
> paste(str_before_nth(name, "_", 5), str_after_nth(name, "_", 5), sep="")
[1] "Bacteria_Firmicutes_Clostridia_Clostridiales_Ruminococcaceae_Ruminococcus_Ruminococcus_albus"
> 

Upvotes: 1

Related Questions