nathan

Reputation: 191

How to search an entire data frame for a keyword in R

  V16      V17      V18
nm:i:18  ms:i:40  as:i:40
ms:i:30  as:i:25  nn:i:0
ms:i:40  as:i:40  nn:i:0

I have three columns whose values carry a tag such as ms, as, or nn (one row also has nm). For each row, I want to extract the value tagged ms and compare it to the value tagged as.

I tried grepl, subset, and which, but I'm not sure what the best way to compare these is.

For example:

  V17     V18
ms:i:40 as:i:40

OR

  V16     V17
ms:i:30 as:i:25

Expected (create new columns with the values sorted):

  V19      V20
ms:i:40  as:i:40
ms:i:30  as:i:25  
ms:i:40  as:i:40  

Upvotes: 0

Views: 111

Answers (1)

akash87

Reputation: 3994

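A dplyr and tidyr pipeline is probably the most straightforward approach: reshape to long format, keep only the ms and as values, then spread them back into two new columns.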

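library(dplyr)  # filter(), mutate(), select(), %>%
library(tidyr)  # gather(), spread()

df <- data.frame(ID = 1:3,
                 V16 = c("nm:i:18", "ms:i:30", "ms:i:40"),
                 V17 = c("ms:i:40", "as:i:25", "as:i:40"),
                 V18 = c("as:i:40", "nn:i:0", "nn:i:0"))

df %>%
  # long format: one row per ID/column pair
  gather(id, var, V16:V18) %>%
  # keep only the values tagged ms or as
  filter(grepl("^(ms|as):", var)) %>%
  # ms values go to V19, as values go to V20
  mutate(newID = ifelse(grepl("^ms:", var), "V19", "V20")) %>%
  # drop the original column name so spread() yields one row per ID
  dplyr::select(-id) %>%
  spread(newID, var)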

  ID     V19     V20
1  1 ms:i:40 as:i:40
2  2 ms:i:30 as:i:25
3  3 ms:i:40 as:i:40
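
If you would rather avoid extra packages, the same idea can be sketched in base R. This is only a rough sketch under a few assumptions: `pick_tag()` and `tag_value()` are hypothetical helpers made up for illustration, the number to compare is assumed to be the part after the last colon, and "compare" is assumed to mean a numeric greater-than check (the question does not say which comparison is wanted).

```r
# Same example data as above, kept as character columns
df <- data.frame(ID = 1:3,
                 V16 = c("nm:i:18", "ms:i:30", "ms:i:40"),
                 V17 = c("ms:i:40", "as:i:25", "as:i:40"),
                 V18 = c("as:i:40", "nn:i:0", "nn:i:0"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: first value in a row that starts with "<tag>:",
# or NA if the row has no such value
pick_tag <- function(row, tag) {
  hit <- grep(paste0("^", tag, ":"), row, value = TRUE)
  if (length(hit) == 0) NA_character_ else hit[[1]]
}

# Hypothetical helper: integer after the last colon (assumed to be the value)
tag_value <- function(x) as.integer(sub("^.*:", "", x))

tag_cols <- c("V16", "V17", "V18")

# Row-wise lookup of the ms and as values, wherever they sit in the row
df$V19 <- unname(apply(df[tag_cols], 1, pick_tag, tag = "ms"))
df$V20 <- unname(apply(df[tag_cols], 1, pick_tag, tag = "as"))

# Assumed comparison: is the ms value strictly greater than the as value?
df$ms_gt_as <- tag_value(df$V19) > tag_value(df$V20)
```

Here `V19` and `V20` hold the same strings as in the output above, and `ms_gt_as` is a logical column you can filter on. Swap the `>` for whatever comparison you actually need.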

Upvotes: 1
