Reputation: 1
I wonder if you can help me extract part of a string using R. I have a column d with the following elements:
d<-
[1] Homo sapiens (Human)
[2] Pan troglodytes (Chimpanzee)
[3] Pan troglodytes (Chimpanzee)
[4] Nomascus leucogenys (Northern white-cheeked gibbon) (Hylobates leucogenys)
[5] Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
[6] Macaca mulatta (Rhesus macaque)
[7] Macaca mulatta (Rhesus macaque)
[8] Callithrix jacchus (White-tufted-ear marmoset)
I want to select everything before the parentheses, i.e. the answer would be:
d<-
[1] Homo sapiens
[2] Pan troglodytes
[3] Pan troglodytes
[4] Nomascus leucogenys
[5] Macaca fascicularis
[6] Macaca mulatta
[7] Macaca mulatta
[8] Callithrix jacchus
Thanks
Upvotes: 0
Views: 3121
Reputation: 5424
Also, stringr is a great package.
library(stringr)
s <- "Homo sapiens (Human)"
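# Lazy (.+?) stops at the first " (" so entries with several parenthesised parts are cut correctly.
# Column 2 of the match matrix holds the first capture group.
species <- str_match(s, "^(.+?)\\s\\(")[, 2]
species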
[1] "Homo sapiens"
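To run this over the whole column rather than a single string, something along these lines might work. It is only a sketch: the vector d is rebuilt here from a few of the entries in the question, and the name species is just illustrative. It uses a lazy quantifier (.+?), because a greedy .+ would run up to the last " (" in entries that have two parenthesised parts.

```r
library(stringr)

# A few of the entries from the question, rebuilt as a character vector
d <- c(
  "Homo sapiens (Human)",
  "Pan troglodytes (Chimpanzee)",
  "Nomascus leucogenys (Northern white-cheeked gibbon) (Hylobates leucogenys)",
  "Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)"
)

# str_match() returns a matrix: column 1 is the full match,
# column 2 is the first capture group (the text before the first " (")
species <- str_match(d, "^(.+?)\\s\\(")[, 2]
print(species)
```

Note that str_match() gives NA for any element that contains no " (" at all, so entries without parentheses would need separate handling.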
Upvotes: 3
Reputation: 546073
The easiest way in R is to delete everything from the first opening parenthesis onward (including the preceding whitespace, if any):
result = sub(' *\\(.*$', '', d)
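As a quick check, here is a sketch of that call applied to a vector rebuilt from some of the entries in the question (the exact contents of d are an assumption about how the column is stored). Because sub() replaces only the first match and the pattern runs to the end of the string, entries with two parenthesised parts are handled too, and an element without any parenthesis is returned unchanged.

```r
# A few of the entries from the question, rebuilt as a character vector
d <- c(
  "Homo sapiens (Human)",
  "Pan troglodytes (Chimpanzee)",
  "Nomascus leucogenys (Northern white-cheeked gibbon) (Hylobates leucogenys)",
  "Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)",
  "Callithrix jacchus (White-tufted-ear marmoset)"
)

# Drop any spaces before the first "(", the "(" itself, and everything after it
result <- sub(" *\\(.*$", "", d)
print(result)
```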
Upvotes: 3