student123

Reputation: 13

How do I turn the values in a subset into text so that I can generate a wordcloud?

From a large tabular data set (read in with the read.csv function), I have created a subset from a column that lists different job roles, one per row.

View(jobs_mysubset)

.....

\[995\] physician

\[996\] painter

\[997\] engineer

\[998\]

\[999\] architect

\[1000\]

\[ reached getOption("max.print") -- omitted 2634 entries \]

I would now like to turn this list into a wordcloud but have been unable to do so using the following:

library(wordcloud2)

wordcloud2(data=jobs_mysubset, size=5)

However, this gives me the following error: Error in `[.data.frame`(data, , 1:2) : undefined columns selected

How do I resolve this error? I don't think R recognises the words as data. How can I fix that?

I have also tried

jobs_mysubset <- text_dataframe %>% unnest_tokens(word, text) %>% anti_join(stopwords("en") jobs_mysubsetf = freq_dataframe %>% count(word)

This gives me the following error:

Error: unexpected symbol in: "jobs_mysubset \<- text_dataframe %\>% unnest_tokens(word, text) %\>% anti_join(stopwords("en") jobs_mysubset"

Upvotes: 0

Views: 26

Answers (1)

Luciefromdafuture

Reputation: 31

First, you will need some packages:

library(tm) 
library(wordcloud) 

I'll create a data.frame called subset as an example.

subset <- data.frame(word = sample(c("text1", "text2", "text3", "-", "BiG", "StaCk", "OtherWord", "AnyIdea", "I"), size = 100, replace = TRUE))

I could already draw the cloud by passing the column of words:

wordcloud(subset[[1]], max.words = 200, colors = brewer.pal(8, "Dark2"), rot.per = 0, random.color = TRUE)

But a few cleaning steps improve the quality of the result:

text <- Corpus(VectorSource(subset[[1]])) # build a corpus from the column of words
text <- tm_map(text, content_transformer(tolower)) # convert all letters to lower case
text <- tm_map(text, removePunctuation) # remove punctuation
text <- tm_map(text, function(x) removeWords(x, stopwords(kind = "fr"))) # remove stop words; I used "fr" for French, my language, but you can use "en" for English
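To see what these steps do to the words themselves, here is a rough base R sketch of the same kind of cleaning, without tm. The `words` vector and the very short `stop_words` list are made up for illustration; the lists returned by tm's `stopwords()` are much longer, and `removeWords()` works inside longer strings rather than on whole entries only.

```r
# Hypothetical input standing in for the column of words
words <- c("BiG", "StaCk", "-", "I", "text1", "OtherWord", "")

# Tiny illustrative stop-word list (an assumption, not tm's list)
stop_words <- c("i", "the", "a")

cleaned <- tolower(words)                     # lower-case everything
cleaned <- gsub("[[:punct:]]", "", cleaned)   # strip punctuation
cleaned <- cleaned[!cleaned %in% stop_words]  # drop entries that are stop words
cleaned <- cleaned[nzchar(cleaned)]           # drop entries that are now empty

print(cleaned)
```

Entries such as "-" and "I" disappear after cleaning, so they will not show up in the cloud.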

And finally

wordcloud(text, max.words = 200, colors = brewer.pal(8, "Dark2"),  rot.per=0,random.color = TRUE)

There are a lot of settings you can change, such as the colors and the scale, or you can manually remove words like this:

text2 <- tm_map(text, function(x)removeWords(x,c("word","to","delete","here"))) 

wordcloud(text2, max.words = 200, colors = brewer.pal(8, "Dark2"), rot.per = 0)
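About the wordcloud2 call in the question: the error message shows wordcloud2 selecting columns 1:2 of its input, and its documentation asks for a data frame with the word in the first column and its frequency in the second. Here is a minimal sketch of building such a table in base R, assuming the job roles are available as a character vector. The `jobs` vector below is made up and stands in for the column behind `jobs_mysubset`.

```r
# Made-up stand-in for the job-role column, including blank entries
jobs <- c("physician", "painter", "engineer", "", "architect", "",
          "engineer", "painter", "engineer")

jobs <- jobs[nzchar(trimws(jobs))]  # drop blank rows
counts <- table(jobs)               # count how often each role occurs

# Two columns, word then frequency
freq_df <- data.frame(
  word = names(counts),
  freq = as.integer(counts),
  stringsAsFactors = FALSE
)
freq_df <- freq_df[order(-freq_df$freq), ]  # most frequent first

print(freq_df)

# Build the cloud only if wordcloud2 is installed
if (requireNamespace("wordcloud2", quietly = TRUE)) {
  cloud <- wordcloud2::wordcloud2(data = freq_df, size = 1)
  # print(cloud) displays it in the viewer or browser
}
```

With your own data, replace `jobs` with the column of job roles from your data frame.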

If you can't solve your problem, let me know!

Have a great day :)

Upvotes: 0
