Ace

Reputation: 35

Using R to filter comments for text mining

I am using R and am relatively new to programming, so any help will be appreciated.

I am text mining for a survey and would like to filter comments with a combination of words. The data set has been read from a csv file.

I want to keep only the comments that contain both of the words "abroad" and "charges".

I am using the grepl function to match the pattern within the comments. I have managed to filter the Comment column for the words "abroad" and "charges" using the following code:

ac <- filter(data, grepl("abroad|charges", Comment))

ac$Comment

This returns comments that contain either "abroad" or "charges", but I want only the comments that contain both words. I tried replacing | with & but this does not work.

I have also tried subset:

ac <- subset(data, Comment %in% c("abroad", "charges"))

ac$Comment

Neither of these returns the desired result. Am I missing something obvious? How can I view only the comments that contain all of a given set of words? For example, to explore the text further I might want the comments that contain "abroad", "charges" and "expensive" together.

Thanks, any help would be great.

Upvotes: 2

Views: 975

Answers (1)

akrun

Reputation: 887108

We can combine two grepl calls with the & operator inside filter. The result is TRUE only for comments that contain both 'abroad' and 'charges'.

filter(data, grepl("abroad", Comment) & grepl("charges", Comment))
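The same idea can be generalised to any number of words, such as the "abroad", "charges" and "expensive" case from the question. The following is only a sketch in base R: the toy data frame and the helper name contains_all are made up for illustration, and ignore.case = TRUE is an assumption about how the matching should behave. Note that & has no special meaning inside a regular expression, which is why "abroad&charges" matches nothing useful, and that %in% tests whether the whole comment equals one of the words rather than whether it contains them.

```r
# Toy data standing in for the survey CSV (hypothetical comments)
data <- data.frame(
  Comment = c(
    "Charges abroad are too expensive",
    "I was abroad last month",
    "The charges are fine",
    "Roaming charges when abroad surprised me"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where a comment contains every word in `words`.
# One grepl() call per word, combined element-wise with `&`.
contains_all <- function(text, words) {
  hits <- lapply(words, function(w) grepl(w, text, ignore.case = TRUE))
  Reduce(`&`, hits)
}

# Comments containing both words
both <- data[contains_all(data$Comment, c("abroad", "charges")), , drop = FALSE]

# Comments containing all three words
all_three <- data[contains_all(data$Comment,
                               c("abroad", "charges", "expensive")), , drop = FALSE]

both$Comment
all_three$Comment
```

The logical vector returned by contains_all can also be passed to filter, for example filter(data, contains_all(Comment, c("abroad", "charges"))). Because grepl matches substrings, "charges" would also match "surcharges"; wrapping each word as "\\bcharges\\b" is one way to require whole words if that matters.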

Upvotes: 1
