How can I remove part of the names in one column of a data frame?

Question

I have data that looks like this:

v1                                      v2
phenzine.MO.4213121906560.C02.name      2.376140e-05
dnium.bte.MO.02400072107987.E10.name    2.423254e-05
trene.MO.024213121906564.C09.name       2.438986e-05
tilli.MO.550760072207033.F09.name       2.495574e-05
tnolone.MO..614615111406.name           2.511859e-05

I want to remove part of each value in the first column so that it looks like this:

      v1              v2
    phenzine    2.376140e-05
    dnium.bte   2.423254e-05
    trene       2.438986e-05
    tilli       2.495574e-05
    tnolone     2.511859e-05

I know I must use grep or sub, but I could not get it to work.

akrun · Accepted Answer

You can try the regex below if 'MO' is common to all the elements:

 df1$v1 <- sub('\\.MO.*', '', df1$v1)

Or suppose you want to remove everything from the first . that is followed by a capital letter:

 sub('\\.[A-Z].*', '', df1$v1)
 #[1] "phenzine"  "dnium.bte" "trene"     "tilli"     "tnolone"

Or, if the set of markers is more specific:

 sub('\\.(MO|NO|NR).*', '', df1$v1)
#[1] "phenzine"  "dnium.bte" "trene"     "tilli"     "tnolone"
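
For anyone who wants to try this end to end, here is a minimal reproducible sketch. It assumes the data frame is called df1 and that v1 is stored as character rather than factor (hence stringsAsFactors = FALSE); the values are copied from the question. Note that the backslash has to be doubled inside an R string: "\\." reaches the regex engine as \. and matches a literal dot.

```r
# Rebuild the sample data from the question (the name df1 is an assumption).
df1 <- data.frame(
  v1 = c("phenzine.MO.4213121906560.C02.name",
         "dnium.bte.MO.02400072107987.E10.name",
         "trene.MO.024213121906564.C09.name",
         "tilli.MO.550760072207033.F09.name",
         "tnolone.MO..614615111406.name"),
  v2 = c(2.376140e-05, 2.423254e-05, 2.438986e-05,
         2.495574e-05, 2.511859e-05),
  stringsAsFactors = FALSE
)

# Drop everything from the literal ".MO" to the end of each string.
df1$v1 <- sub("\\.MO.*", "", df1$v1)
df1$v1
#[1] "phenzine"  "dnium.bte" "trene"     "tilli"     "tnolone"
```

sub() only replaces the first match in each string, which is enough here because the pattern runs to the end of the string.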
