How to count documents in which two words appear in close proximity in R?

Question

I would like to count the documents in which two strings appear within a set distance of each other, say within 10 words. Let's say 'German*' and 'War'. I do not want the total number of times they co-occur, only the number of documents in which the pair appears (a document counts once, no matter how often the pair occurs in it).

I know how to count documents that contain a single word. But I am not sure whether I need to extract 10-grams, check whether both words appear in each one, and then count this per document, or whether there is a more efficient way.


Answers (1)
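
One possible approach that avoids building 10-grams is to compare token positions directly. The following is only a minimal sketch in base R, not a definitive implementation. It assumes that "within 10 words" means the two token positions differ by at most 10, that matching should be case-insensitive, and that 'German*' can be expressed as the regular expression `^german`. The helper `near_in_doc()` and the sample `docs` vector are hypothetical names made up for illustration.

```r
# Hypothetical helper: TRUE if a token matching `pat1` occurs within
# `window` token positions of a token matching `pat2` in one document.
near_in_doc <- function(text, pat1, pat2, window = 10) {
  # Lower-case and split on runs of non-alphanumeric characters (assumed tokenisation)
  tokens <- unlist(strsplit(tolower(text), "[^[:alnum:]]+"))
  tokens <- tokens[nzchar(tokens)]

  pos1 <- grep(pat1, tokens)  # positions of the first term
  pos2 <- grep(pat2, tokens)  # positions of the second term
  if (length(pos1) == 0 || length(pos2) == 0) return(FALSE)

  # All pairwise position differences; one close pair is enough
  any(abs(outer(pos1, pos2, "-")) <= window)
}

# Made-up example corpus
docs <- c(
  d1 = "The German army entered the war in 1914.",
  d2 = "German food is famous for bread and sausages and beer and many kinds of cake, but nobody mentions the war.",
  d3 = "No relevant words here."
)

# One logical per document, so each document is counted at most once
hits <- vapply(docs, near_in_doc, logical(1),
               pat1 = "^german", pat2 = "^war$", window = 10)

sum(hits)  # number of documents containing the pair in proximity
```

In this toy corpus only `d1` qualifies: in `d2` both words occur, but they are 19 positions apart. Because the function returns a single logical per document, repeated co-occurrences inside one document do not inflate the count. For large corpora a dedicated text-analysis package may be faster, but the position-difference idea stays the same.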
