branch.lizard

Reputation: 595

Word Frequency to DataFrame from strings in R

I would like to take a few character vectors and build a data frame holding the frequency of each word found in them. The column names of the data frame should be the unique words found across all of the vectors combined. I have that part working; what I am stuck on is filling the data frame with the frequency of each word per vector. Below is a heavily scaled-down version of what I am attempting. I have tried using table(), but I am not sure I am on the right track.

a <- c('A', 'B', 'C', 'D', 'E')
b <- c('A', 'D', 'J', 'G', 'X')
c <- c('A', 'A', 'B', 'B', 'C', 'X')

Example Data.Frame Design

vector.name  A  B  C  D  E  J  G  X 
a            1  1  1  1  1  0  0  0
b            1  0  0  1  0  1  1  1
c            2  2  1  0  0  0  0  1

Upvotes: 2

Views: 445

Answers (2)

thelatemail

Reputation: 93938

This is essentially one table operation once you have a long dataset:

table(stack(mget(c("a","b","c")))[2:1])

#   values
#ind A B C D E G J X
#  a 1 1 1 1 1 0 0 0
#  b 1 0 0 1 0 1 1 1
#  c 2 2 1 0 0 0 0 1
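To make the one-liner above easier to follow, here is a rough step-by-step sketch of the same idea. The intermediate names (vecs, long, tab, freq) are only illustrative, and the final as.data.frame.matrix() call is an extra step that assumes a plain data frame is wanted rather than a table object:

```r
a <- c('A', 'B', 'C', 'D', 'E')
b <- c('A', 'D', 'J', 'G', 'X')
c <- c('A', 'A', 'B', 'B', 'C', 'X')

# Step 1: collect the vectors into a named list (mget looks them up by name)
vecs <- mget(c("a", "b", "c"))

# Step 2: stack() reshapes the list into a long data frame with the columns
# `values` (the word) and `ind` (the name of the vector it came from)
long <- stack(vecs)

# Step 3: cross-tabulate source vector against word
tab <- table(long$ind, long$values)

# Step 4 (assumed extra step): turn the contingency table into a data frame
# with one row per vector and one column per word
freq <- as.data.frame.matrix(tab)
```

Note that table() orders the word columns alphabetically, so G comes before J here, unlike in the layout shown in the question.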

Upvotes: 1

Maurits Evers

Reputation: 50728

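This should work. Converting each vector to a factor whose levels are the unique words across all vectors makes table() report a zero count for any word a vector does not contain: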

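# Count how often each word occurs in every vector of the list `l`
countUniqueEntries <- function(l) {
    # Use the same levels for every vector so absent words are counted as 0
    lvls <- unique(unlist(l))
    lapply(l, function(x) table(factor(x, levels = lvls)))
}

# Bind the per-vector counts row-wise into a matrix
do.call(rbind, countUniqueEntries(list(a, b, c)))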
     A B C D E J G X
[1,] 1 1 1 1 1 0 0 0
[2,] 1 0 0 1 0 1 1 1
[3,] 2 2 1 0 0 0 0 1
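The result above is a matrix without row names. If the vector names from the question are wanted as row labels, one possible variation is to pass a named list and convert the result with as.data.frame(). This is only a sketch; the function name count_words and the variable freq are made up for illustration:

```r
a <- c('A', 'B', 'C', 'D', 'E')
b <- c('A', 'D', 'J', 'G', 'X')
c <- c('A', 'A', 'B', 'B', 'C', 'X')

# Hypothetical helper: per-vector word counts over a shared set of levels
count_words <- function(l) {
    lvls <- unique(unlist(l))
    lapply(l, function(x) table(factor(x, levels = lvls)))
}

# The names of the list elements become the row names after rbind
counts <- do.call(rbind, count_words(list(a = a, b = b, c = c)))

# Convert the count matrix into a data frame
freq <- as.data.frame(counts)
```

Because the levels come from unique(unlist(l)), the columns keep the order in which the words first appear (A B C D E J G X), which matches the layout in the question.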

Upvotes: 3
