Searching key terms (Corpus) into another in R

Question

I asked this question before and it was poorly received because I did not include any code. I have since spent the whole day on it and am now stuck on one issue.

This code was provided by the Stack Overflow user "Tyler Rincker" <- big thanks to him!

Here is the code:

# Trim() was not included in the post; assumed here to strip leading/trailing whitespace
Trim <- function(x) gsub("^\\s+|\\s+$", "", x)

strip <- function(x, digit.remove = TRUE, apostrophe.remove = FALSE){
          strp <- function(x, digit.remove, apostrophe.remove){
            # drop punctuation (keeping apostrophes) and lower-case the text
            x2 <- Trim(tolower(gsub(".*?($|'|[^[:punct:]]).*?", "\\1", as.character(x))))
            x2 <- if(apostrophe.remove) gsub("'", "", x2) else x2
            if(digit.remove) gsub("[[:digit:]]", "", x2) else x2
          }
          unlist(lapply(x, function(x) Trim(strp(x = x, digit.remove = digit.remove, 
                              apostrophe.remove = apostrophe.remove)) ))
}

corpus2 <- "In Westerman's disruptive article, Quantitative research as 
        an interpretive enterprise: The mostly 
        unacknowledged role of interpretation in research efforts."

    corpus2 <- gsub("\\s+", " ", gsub("\n|\t", " ", corpus2))
    corpus2.wrds <- as.vector(unlist(strsplit(strip(corpus2), " ")))

    corpus2.Freq <- data.frame(table(corpus2.wrds))
    corpus2.Freq$corpus2.wrds  <- as.character(corpus2.Freq$corpus2.wrds)
    corpus2.Freq <- corpus2.Freq[order(-corpus2.Freq$Freq), ]
    rownames(corpus2.Freq) <- 1:nrow(corpus2.Freq)

    key.terms <- c("research as")

My issue is that I want to search for bigrams or trigrams (2 or 3 words) in the corpus.

When I execute this line of code:

corpus2.Freq[corpus2.Freq$corpus2.wrds %in% key.terms, ]

I get the following result, although I expected it to show a frequency of 1:

[1] corpus2.wrds Freq        
<0 rows> (or 0-length row.names)

However, if the key term is only one word:

key.terms <- c("research")
    corpus2.Freq[corpus2.Freq$corpus2.wrds %in% key.terms, ]

the code works fine and I get the following result:

corpus2.wrds Freq
research    2

Thanks a lot! Hopefully someone can help.
