Reputation: 131
I am trying to write a function that returns the number of occurrences of a word in a text file. To do this, I create a list that contains all the words of the text (single letters such as a, c, d, e and f stand in for words in this example):
[[1]]
[1] a
[2] f
[3] e
[4] a
[[2]]
[1] f
[2] f
[3] e
I then build a table that stores the number of occurrences of each word:
table(unlist(list))
a b c d e
3 3 2 1 1
My question now is: how can I extract the number of occurrences of a word passed as a parameter? The function would have this structure:
GetOccurence <- function(word, table)
{
return(occurence)
}
Any ideas? Thanks in advance.
Upvotes: 2
Views: 3471
Reputation: 18625
To answer the question with respect to your function, you could take the following approach.
For the sake of reproducibility, I used publicly-available data and cleaned it a little.
library(tm)
data(acq)
# Basic cleaning
acq <- tm_map(acq, removePunctuation)
acq <- tm_map(acq, removeNumbers)
acq <- tm_map(acq, tolower)
acq <- tm_map(acq, removeWords, stopwords("english"))
acq <- tm_map(acq, stripWhitespace)
acq <- tm_map(acq, PlainTextDocument)
# Split list into words
wrds <- strsplit(paste(unlist(acq), collapse = " "), ' ')[[1]]
# Table
tblWrds <- table(wrds)
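GetOccurence <- function(word, table) {
  # Convert the table to a data frame: column 1 holds the words, column 2 the counts
  occurence <- as.data.frame(table)
  # Keep every row whose word matches the pattern (substring matches included)
  occurence <- occurence[grep(word, occurence[,1]), ]
  return(occurence)
}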
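To see what a call returns, here is a small sketch on toy data. The vector toyWords and its contents are made up for illustration and are not taken from acq; the function body is the same as above, repeated so the snippet runs on its own.

```r
# Toy data, made up for illustration (not taken from acq)
toyWords <- c("share", "shares", "shareholder", "bank", "share")
toyTable <- table(toyWords)

# Same body as the function above
GetOccurence <- function(word, table) {
  occurence <- as.data.frame(table)
  occurence <- occurence[grep(word, occurence[, 1]), ]
  return(occurence)
}

# grep treats "share" as a pattern, so "shareholder" and "shares"
# are returned alongside "share"
res <- GetOccurence("share", toyTable)
print(res)
```

The result is a data frame with one row per matching word and its count in the Freq column.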
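The function above uses grep, so it returns every entry whose text contains word. To match full words only, wrap the pattern in word boundaries (\\b), as in the version below, which builds on this answer.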
GetOccurence <- function(word, table) {
  occurence <- as.data.frame(table)
  # Surround the word with word boundaries so only whole words match
  word <- paste0("\\b", word, "\\b")
  occurence <- occurence[grep(word, occurence[,1]), ]
  return(occurence)
}
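If only the count of one exact word is needed, indexing the table by name may be simpler than pattern matching. The following is a minimal sketch of that alternative, not part of the approach above: GetCount is a hypothetical helper name, and the toy vector mirrors the words in the question's list.

```r
# Toy data mirroring the list in the question
toyWords <- c("a", "f", "e", "a", "f", "f", "e")
toyTable <- table(toyWords)

# Hypothetical helper: exact lookup by name, returning 0 when the word is absent
GetCount <- function(word, table) {
  if (word %in% names(table)) {
    as.integer(table[word])
  } else {
    0L
  }
}

countF <- GetCount("f", toyTable)  # word present in the table
countZ <- GetCount("z", toyTable)  # word absent from the table
```

Because the lookup is by name rather than by regular expression, characters that are special in patterns (such as a dot) need no escaping.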
Upvotes: 4