PsychometStats

Reputation: 380

R: grep multiple strings at once

I have a data frame with 1 variable and 5,000 rows, where each element is a string.

1. "Am open about my feelings."                   
2. "Take charge."                                 
3. "Talk to a lot of different people at parties."
4. "Make friends easily."                         
5. "Never at a loss for words."                   
6. "Don't talk a lot."                            
7. "Keep in the background."                      
   .....
5000. "Speak softly."           

I need to find and output row numbers that correspond to 3 specific elements. Currently, I use the following:

grep("Take charge." ,  df[,1]) 
grep("Make friends easily.",  df[,1])  
grep("Speak softly.",  df[,1])

And get the following output: [1] 2 [2] 4 [3] 5000

Question 1. Is there a way to make the syntax more succinct, so I do not have to repeat grep and df[,1] on every line?

Question 2. If so, how can I output a single numeric vector of the matching row positions, so the result looks something like this?

2, 4, 5000

What I have tried so far:
grep("Take charge." , "Make friends easily.", "Speak softly.", df[,1]) # this didn't work

I also tried creating a vector, m1, that contains all three elements and then calling grep(m1, df[,1]), but this didn't work either.

Upvotes: 0

Views: 5891

Answers (1)

G. Grothendieck

Reputation: 270248

Since these are exact matches, use match, where phrases is a character vector of the phrases you want to find:

match(phrases, df[, 1])
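
A minimal sketch of this, assuming a small stand-in data frame (the `df`, its `item` column, and `phrases` below are made up from the question's examples, not the real 5,000-row data):

```r
# Hypothetical stand-in for the question's data frame (7 rows instead of 5,000)
df <- data.frame(
  item = c("Am open about my feelings.",
           "Take charge.",
           "Talk to a lot of different people at parties.",
           "Make friends easily.",
           "Never at a loss for words.",
           "Don't talk a lot.",
           "Speak softly."),
  stringsAsFactors = FALSE
)

# The phrases to look up, as a single character vector
phrases <- c("Take charge.", "Make friends easily.", "Speak softly.")

# match() returns, for each phrase, the position of its first exact match
rows <- match(phrases, df[, 1])
rows
# [1] 2 4 7
```

Note that match gives NA for a phrase that is not present and reports only the first occurrence of each phrase; if a phrase could appear in several rows, which(df[, 1] %in% phrases) would return all of them.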

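grep accepts only a single pattern (given a vector, it uses the first element and warns), so collapse the phrases into one alternation. This also works provided no phrase is a substring of some other row's text:

grep(paste(phrases, collapse = "|"), df[, 1])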
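
To illustrate the substring caveat, here is a small sketch with a made-up vector `x` (hypothetical data, not from the question) in which one phrase also occurs inside a longer element. It compares exact matching with a single grep call on an alternation pattern built with paste:

```r
# Hypothetical data: "Take charge." is also a substring of element 3
x <- c("Speak softly.", "Take charge.", "Take charge. Always.")
phrases <- c("Take charge.", "Speak softly.")

# Exact matching: one position per phrase, in the order of `phrases`
exact <- match(phrases, x)
exact
# [1] 2 1

# Pattern matching: every element containing any phrase, in row order
pattern <- paste(phrases, collapse = "|")
partial <- grep(pattern, x)
partial
# [1] 1 2 3
```

The grep result picks up element 3 as well, and its positions follow the order of the data rather than the order of the phrases, which is why match is the safer choice for exact lookups.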

Upvotes: 3
