Sahil Desai

Reputation: 3696

Extract particular word from string in R

I have a data.frame

 >df
        ID               NUM
  ABC.s4543rp.dfr54s     1234
  com.ffd54646.ABC       54646
  ABC                    554648     
  PQR                    13546
  dfsdf56.PQR            99874
  dsfsdff.df56.PQR       464655
  94348.PQR.564564d      456464
  MNO.dwee5555           54556
  sdfdfgfdg5.MNO         87895
  fdf.MNO.sf5e65         548644

Here I want to extract a particular word from each ID into a new column. For example:

            ID               NUM         Word
      ABC.s4543rp.dfr54s     1234        ABC
      com.ffd54646.ABC       54646       ABC
      ABC                    554648      ABC
      PQR                    13546       PQR
      dfsdf56.PQR            99874       PQR
      dsfsdff.df56.PQR       464655      PQR
      94348.PQR.564564d      456464      PQR
      MNO.dwee5555           54556       MNO
      sdfdfgfdg5.MNO         87895       MNO
      fdf.MNO.sf5e65         548644      MNO

I am having trouble with this task. I think I have to prepare a list of the words I want to extract from ID. If you have a solution, please let me know.

Upvotes: 1

Views: 5981

Answers (1)

akrun

Reputation: 887971

The conditions are not clear, but perhaps str_extract can be used. If we have a list of specific words, they can be passed in the pattern argument, separated by |.

library(stringr)
df$Word <- str_extract(df$ID, "ABC|PQR|MNO")
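If the words are kept in a vector, the pattern can be built from it instead of typed by hand. Below is a minimal base R sketch of that idea, assuming the data from the question; the vector name `words` is only illustrative, and `regexpr()`/`regmatches()` are swapped in for `str_extract` so that no package is needed.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID = c("ABC.s4543rp.dfr54s", "com.ffd54646.ABC", "ABC", "PQR",
         "dfsdf56.PQR", "dsfsdff.df56.PQR", "94348.PQR.564564d",
         "MNO.dwee5555", "sdfdfgfdg5.MNO", "fdf.MNO.sf5e65"),
  NUM = c(1234, 54646, 554648, 13546, 99874, 464655, 456464,
          54556, 87895, 548644),
  stringsAsFactors = FALSE
)

# Hypothetical lookup vector: the words we want to find in ID
words <- c("ABC", "PQR", "MNO")

# Collapse the words into a single alternation pattern: "ABC|PQR|MNO"
pattern <- paste(words, collapse = "|")

# regexpr() gives the position of the first match per string (-1 if none)
m <- regexpr(pattern, df$ID)

# regmatches() returns only the matched pieces, so fill matched rows
# explicitly and leave the rest as NA
df$Word <- NA_character_
df$Word[m != -1] <- regmatches(df$ID, m)
```

Rows whose ID contains none of the words are left as NA, which mirrors what str_extract does for a non-match.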

Or, if the word is any run of upper case letters, use the pattern [A-Z]+, i.e. one or more upper case letters. str_extract returns the first such match in each string.

str_extract(df$ID, "[A-Z]+")
#[1] "ABC" "ABC" "ABC" "PQR" "PQR" "PQR" "PQR" "MNO" "MNO" "MNO"
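Note that [A-Z]+ picks up the first upper case run anywhere in the string, so an ID such as "Xy.PQR" would give "X". If the word always fills a whole dot-separated piece of the ID (an assumption, the question does not say so), a stricter pattern with lookarounds may help. This is only a sketch in base R; the function name `extract_upper_token` is made up for illustration.

```r
# Hypothetical helper: return the first dot-delimited piece of each ID
# that consists solely of upper case letters, or NA if there is none
extract_upper_token <- function(ids) {
  # (?<![^.])  preceded by a dot or the start of the string
  # [A-Z]+     one or more upper case letters
  # (?![^.])   followed by a dot or the end of the string
  pattern <- "(?<![^.])[A-Z]+(?![^.])"
  m <- regexpr(pattern, ids, perl = TRUE)  # lookarounds need perl = TRUE
  out <- rep(NA_character_, length(ids))
  out[m != -1] <- regmatches(ids, m)
  out
}

extract_upper_token(c("Xy.PQR", "ABC.s4543rp.dfr54s", "94348.PQR.564564d", "abc"))
```

The same pattern string should also work with str_extract, since stringr's regex engine supports lookarounds.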

Upvotes: 3
