Gaurav Singhal

Reputation: 1096

Dynamically assign a factor/string name inside a function in R

I want to output multiple PNG files, but I don't know how to name them. I want to make multiple word clouds from one data frame: one column contains the text from which the word clouds are built, and another contains the category each word cloud belongs to. I have written a function that creates a single word cloud. Now I want to save all the word clouds in one go using tapply (or something else), with each file named after its category, e.g. category_name.png. This is a reproducible version of the code I have written:

library(tm)
library(wordcloud)

String_column = rep(c("hello round world", "beautiful round world", "good girls world", "unfair mean world", "hi girls hello","sad girls sorry"),6)
Category_column = rep(c("Neutral", "Pos", "Pos", "Neg", "Neutral","Neg"),6)

getCloudData <- function(StrCol,CatCol){
    answer_text <- paste(StrCol, collapse=" ") 
    answer_source = VectorSource(answer_text)
    corpus = Corpus(answer_source)
    dtmWords = DocumentTermMatrix(corpus)
    matrixWords = as.matrix(dtmWords)

    freqWords = colSums(matrixWords)
    freqWords = sort(freqWords, decreasing = TRUE)
    words <- names(freqWords)
    png("C:\\Users\\GSinghal\\Downloads\\Text Mining\\catagory_xyz.png")
    wordcloud(words[1:3], freqWords[1:3])
    dev.off()
}

getCloudData(String_column,Category_column)
tapply(String_column, Category_column,getCloudData)

When I use tapply, I want three files to be saved, named Neutral.png, Pos.png and Neg.png. The actual data contains around 11,000 strings and 20 to 65 categories.

Upvotes: 0

Views: 99

Answers (1)

Chris

Reputation: 6372

Rewriting your function a bit so that it uses the category it is given to build the file name, and using data.table instead of tapply, can work:

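library(tm)         # Corpus, VectorSource, DocumentTermMatrix
library(wordcloud)  # wordcloud
library(data.table)

# One row per string, with its category alongside
DT <- data.table(Cat_Col = Category_column, Str_Col = String_column)

getCloudData <- function(StrCol, Category){
  # Collapse all strings of this category into one document and count terms
  answer_text <- paste(StrCol, collapse=" ")
  answer_source = VectorSource(answer_text)
  corpus = Corpus(answer_source)
  dtmWords = DocumentTermMatrix(corpus)
  matrixWords = as.matrix(dtmWords)
  freqWords = colSums(matrixWords)
  freqWords = sort(freqWords, decreasing = TRUE)
  words <- names(freqWords)
  # File name is built from the category value, e.g. "Pos.png"
  png(paste(unique(Category),".png", sep = "")) #changed this part
  wordcloud(words[1:3], freqWords[1:3])
  graphics.off() #instead of dev.off; note this closes every open graphics device
}

# Call getCloudData once per distinct value of Cat_Col
DT[ , getCloudData(Str_Col,Cat_Col), by = "Cat_Col"]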

This uses the by argument of data.table to call your getCloudData function once per category, passing it only that category's strings together with the category value.
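To see what by does in isolation, here is a minimal sketch on toy data (DT_demo and its contents are made up for illustration, not taken from the question). It relies on data.table's .BY symbol, which holds the current group's value for each by column, and on .N, the number of rows in the group:

```r
library(data.table)

# Toy data (hypothetical): four strings spread over three categories
DT_demo <- data.table(Cat_Col = c("Neutral", "Pos", "Pos", "Neg"),
                      Str_Col = c("a", "b", "c", "d"))

# The j expression is evaluated once per group.
# .BY$Cat_Col is the single category value for the current group,
# so it can be pasted straight into a file name.
res <- DT_demo[, .(file = paste0(.BY$Cat_Col, ".png"), n_strings = .N),
               by = Cat_Col]
print(res)
```

Each group contributes one row, so res lists one file name per category. Inside the grouped call the by column itself is also a single value, which is why unique(Category) in the function above yields exactly one name per call.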

After the call, Neutral.png, Pos.png and Neg.png are written to your current working directory (see getwd()), because png() is given a relative file name.
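If you would rather stay in base R, the same per-category naming can probably be done with split(), which returns a list whose names are the categories. The following is only a sketch under some assumptions: save_category_cloud, its out_dir and n arguments, and the use of tempdir() are hypothetical choices for illustration, and min.freq = 1 is set so that rare words are not dropped by wordcloud's default threshold.

```r
library(tm)
library(wordcloud)

# Hypothetical helper: draws the n most frequent words of one category
# into <out_dir>/<category>.png and returns the file path invisibly.
save_category_cloud <- function(strings, category, out_dir = getwd(), n = 3) {
  corpus <- Corpus(VectorSource(paste(strings, collapse = " ")))
  freq <- sort(colSums(as.matrix(DocumentTermMatrix(corpus))), decreasing = TRUE)
  top <- head(freq, n)  # head() avoids NA entries when fewer than n words exist
  out_file <- file.path(out_dir, paste0(category, ".png"))
  png(out_file)
  on.exit(dev.off())    # closes only the device opened here
  wordcloud(names(top), top, min.freq = 1)
  invisible(out_file)
}

# Data from the question
String_column <- rep(c("hello round world", "beautiful round world",
                       "good girls world", "unfair mean world",
                       "hi girls hello", "sad girls sorry"), 6)
Category_column <- rep(c("Neutral", "Pos", "Pos", "Neg", "Neutral", "Neg"), 6)

# split() groups the strings by category; names(groups) are the category names
groups <- split(String_column, Category_column)
out_dir <- tempdir()  # assumption: write to a temporary directory for the demo

saved <- vapply(names(groups), function(category) {
  save_category_cloud(groups[[category]], category, out_dir = out_dir)
}, character(1))
print(saved)
```

Iterating over names(groups) rather than over the groups themselves is what makes the category available for the file name; tapply passes only the values to the function, so the name is not visible inside it.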

Upvotes: 1
