user8831872

Reputation: 383

Count the word frequency of every phrase per row in dataframe

I have a dataframe like this:

DF <- data.frame(phrase = c("text 1","this text 2", "text 3"))

I would like to create a column containing the number of words in each row's phrase. For the example above, the expected result is 2, 3, 2 respectively.

What I tried is this

library(data.table)

dfN<- setDT(DF)[, c('phrase') :=tstrsplit(phrase, '(?<=[^0-9])', perl=TRUE, type.convert=TRUE)]

but I receive this error

Error in [.data.table(setDT(DF), , :=(c("phrase"), tstrsplit(phrase, :
  Internal logical error. Up front checks (before starting to modify DT) didn't catch type of RHS ('list') assigning to factor column 'phrase'. Please report to datatable-help.
In addition: Warning message:
In [.data.table(setDT(DF), , :=(c("phrase"), tstrsplit(phrase, :
  Supplied 11 items to be assigned to 3 items of column 'phrase' (8 unused)

Upvotes: 1

Views: 222

Answers (1)

akrun

Reputation: 887741

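We can use str_count from stringr, which counts the matches of a regex in each string. Here the pattern "\\w+" matches each run of word characters, so the count is the number of words per phrase.

library(data.table)
library(stringr)
# count the runs of word characters in each phrase
setDT(DF)[, newcol := str_count(phrase, "\\w+")]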
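For comparison, here is a minimal base R sketch of the same count. It is not part of the original answer, and it assumes that words are separated by whitespace. That differs slightly from "\\w+", which would also split on punctuation such as the apostrophe in "don't".

```r
# Sample data from the question
DF <- data.frame(phrase = c("text 1", "this text 2", "text 3"))

# as.character() guards against the column being a factor
# (the default for data.frame() in R versions before 4.0)
words <- strsplit(trimws(as.character(DF$phrase)), "\\s+")

# lengths() returns the number of pieces per row
DF$newcol <- lengths(words)
DF$newcol
```

For the sample data this gives 2, 3, 2, matching the expected result in the question.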

Upvotes: 1
