Reputation: 1
I have a large matrix with long, complex column names. I need to replace each name with one part of it, obtained by splitting the name on "_".
Sample name:
d__Bacteria.p__Firmicutes.c__Clostridia.o__Lachnospirales.f__Lachnospiraceae.g__Tuzzerella.__
My goal is to extract only the family name, which ends in "aceae" (the sixth piece of the name), from every column name and use it in place of the long, complex name.
Could you help me?
I put the column and row names into vectors, loaded library(stringr), and ran strsplit(colname_matrix, "_").
This gives me a list of split names, but I do not know how to drop the rest and keep only the pieces ending in "aceae", or how to apply this to all the column and row names. The matrix is symmetric.
Upvotes: 0
Views: 50
Reputation: 498
x <- "d__Bacteria.p__Firmicutes.c__Clostridia.o__Lachnospirales.f__Lachnospiraceae.g__Tuzzerella.__"
library(stringr)
# take everything after "f__" up to the next "."
str_extract(x, "(?<=f__)[^.]+")
In base R, if you do not want the "aceae" suffix:
sub(".*\\.f__", "", sub("aceae.*", "", x))
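or, splitting on "__" and keeping the piece that contains "aceae":
y <- str_split(x, "__")[[1]]
# the matching piece is "Lachnospiraceae.g", so drop everything from the "." onward
sub("\\..*", "", y[str_detect(y, "aceae")])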
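To apply the same idea to every row and column name of the matrix at once, something like the following base R sketch might work. The matrix `m`, its values, the second taxonomy string, and the helper `family_name` are all made up for illustration; it assumes each name contains a single "f__" field that ends at the next ".".

```r
# Hypothetical 2 x 2 symmetric matrix with full taxonomy strings as dimnames
taxa <- c(
  "d__Bacteria.p__Firmicutes.c__Clostridia.o__Lachnospirales.f__Lachnospiraceae.g__Tuzzerella.__",
  "d__Bacteria.p__Firmicutes.c__Clostridia.o__Oscillospirales.f__Ruminococcaceae.g__Faecalibacterium.__"
)
m <- matrix(c(1, 0.5, 0.5, 1), nrow = 2, dimnames = list(taxa, taxa))

# Hypothetical helper: capture the text between "f__" and the next ".".
# Names without an "f__" field are returned unchanged, because sub()
# leaves a string alone when the pattern does not match.
family_name <- function(nm) sub(".*f__([^.]*).*", "\\1", nm)

# sub() is vectorised, so whole name vectors can be converted in one call
rownames(m) <- family_name(rownames(m))
colnames(m) <- family_name(colnames(m))
print(m)
```

If several columns belong to the same family, the new names will be duplicated. A matrix allows that, but indexing by name will then only return the first match, so you may want to check with anyDuplicated(colnames(m)) afterwards.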
Upvotes: 0