r gsub extract n words before and after a term

Question

I need to extract n words that appear before and after a term for a text analysis that I'm working on. Below is a reproducible example:

a <- c("The day was nice and dry, when she came for our game we were ready and then she left.",
"The day was nice and dry, when she came for our game, but we were not ready. She left after she waited 5 minutes.",
"The day was nice and dry, when she came, we were not here. Our game  was not completed timely, but it was completed after one hour.")

Below is the call that I'm using, but it does not work when there is punctuation around a word or when words are separated by double spaces.

gsub(".*(( \\w{1,}){3} game( \\w{1,}){3}).*", "\\1", a, perl = TRUE)


[1] " came for our game we were ready"
[2] "The day was nice and dry, when she came for our game, but we were not ready. She left after she waited 5 minutes."
[3] "The day was nice and dry, when she came, we were not here. Our game  was not completed timely, but it was completed after one hour."

Below is the desired output:

[1] " came for our game we were ready"                                                                                                  
[2] " came for our game, but we were"                 
[3] " not here. Our game was not completed"

thc · Accepted Answer

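Instead of matching a literal space between words, try \\W{1,} (one or more non-word characters), which also covers punctuation and repeated spaces: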

gsub(".*(((\\W{1,})\\w{1,}){3} game((\\W{1,})\\w{1,}){3}).*", "\\1", a, perl = TRUE)

[1] " came for our game we were ready"
[2] " came for our game, but we were"
[3] " not here. Our game  was not completed"

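If n or the term needs to vary, the same idea can be wrapped in a small helper. This is only a sketch: extract_context is a hypothetical name that is not part of the answer above, and it assumes the term contains no regex metacharacters. It swaps gsub for regexpr/regmatches so that strings without a match come back as NA instead of being returned unchanged.

```r
# Hypothetical helper (an illustration, not from the original answer):
# return the n words before and after `term`, or NA where nothing matches.
extract_context <- function(x, term, n = 3) {
  # \W+ = one or more non-word characters (spaces, punctuation), \w+ = one word.
  # The literal space before the term is kept from the pattern in the answer.
  # Assumption: `term` is plain text with no regex metacharacters.
  pattern <- sprintf("(\\W+\\w+){%d} %s(\\W+\\w+){%d}", n, term, n)
  m <- regexpr(pattern, x, perl = TRUE)   # first match per string, -1 if none
  hit <- !is.na(m) & m != -1
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)            # regmatches returns only the matches
  out
}

a <- c("The day was nice and dry, when she came for our game we were ready and then she left.",
       "The day was nice and dry, when she came for our game, but we were not ready. She left after she waited 5 minutes.",
       "The day was nice and dry, when she came, we were not here. Our game  was not completed timely, but it was completed after one hour.")

res <- extract_context(a, "game", n = 3)
print(res)
# [1] " came for our game we were ready"
# [2] " came for our game, but we were"
# [3] " not here. Our game  was not completed"
```

Note that the double space in the third string is preserved, as in the gsub output; wrapping the result in gsub("\\s+", " ", res) would collapse it if the single-spaced form from the question is required.
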