Word count for text in column

Question

I have a dataset with a column containing text, as follows:

    Column1
    ----------------------------------------------------------
    dapagliflozin 10 MG / metFORMIN hydrochloride 
    dapagliflozin 5 MG / metFORMIN hydrochloride  
    Fortamet       
    Glucophage      
    Glumetza      
    metFORMIN hydrochloride      
    metFORMIN hydrochloride  / pioglitazone 15 MG     
    metFORMIN hydrochloride  / pioglitazone 30 MG

I am trying to obtain the count of every unique word in the column, for example the count for metFORMIN, the count for hydrochloride, etc. I tried the table function, but it treats each whole row as a single value, which is not helpful.

akrun · Accepted Answer

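We can use a combination of strsplit/unlist/table. Split the column strings with strsplit, specifying the split pattern as one or more whitespace characters (the regex \s+, written as '\\s+' in an R string). The output will be a list with one character vector per row. Use unlist to flatten the list into a single vector, and then use table to get the count of each word.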

    table(unlist(strsplit(yourdf$Column1, '\\s+')))
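
For reference, here is a minimal, self-contained sketch of the same approach run on the sample data from the question. It makes a few assumptions that are not part of the original answer: the column is stored as character (if it is a factor, wrap it in as.character() first, since strsplit only accepts character input), the "/" separator should not be counted as a word, and the result is sorted by frequency. The names `words` and `word_counts` are just illustrative.

```r
# Sample data copied from the question; `yourdf` is the placeholder name used in the answer.
yourdf <- data.frame(
  Column1 = c(
    "dapagliflozin 10 MG / metFORMIN hydrochloride ",
    "dapagliflozin 5 MG / metFORMIN hydrochloride  ",
    "Fortamet       ",
    "Glucophage      ",
    "Glumetza      ",
    "metFORMIN hydrochloride      ",
    "metFORMIN hydrochloride  / pioglitazone 15 MG     ",
    "metFORMIN hydrochloride  / pioglitazone 30 MG"
  ),
  stringsAsFactors = FALSE
)

# trimws() removes leading/trailing whitespace; strsplit() then splits on runs of whitespace.
words <- unlist(strsplit(trimws(yourdf$Column1), "\\s+"))

# Assumption: the "/" separator is not a word, so drop it before counting.
words <- words[words != "/"]

# Count each unique token and sort from most to least frequent.
word_counts <- sort(table(words), decreasing = TRUE)
print(word_counts)
```

Note that the counts are case-sensitive, so "metFORMIN" and "Metformin" would be tallied separately; applying tolower() to `words` before calling table would merge them if that is what you want.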
