Simon Shin

Reputation: 15

Word cloud in R: R is dropping a specific word ("AI")

I am building a word cloud in R. My data contains many occurrences of "AI", but the word cloud does not pick up this word. ChatGPT reports that "AI" appears 43 times in my data. Here is my code:

install.packages("wordcloud")
install.packages("tm")
install.packages("RColorBrewer")
library("wordcloud")
library("tm")
library(ggplot2)
library(readxl) 
library(RColorBrewer)

#####Good Source: https://www.youtube.com/watch?v=oVVvG035vQc

##################################
# Read the data from the Excel file
data <- read_excel("/Users/home/Dropbox/ATD/ATD24 In-Person & Virtual for Delegations.xlsx", sheet = "1. Schedule")

# Extract the session title column
# Replace 'session' with the exact name of your column
titles <- data$`Session Title (Session)`
# Create a text corpus
corpus <- Corpus(VectorSource(titles))

# Define custom stopwords
my_stopwords <- c("learning", "development", "inperson", "streamed", "live", "design", "training", "new", "better", "can", "beyond", "like", "know", "without", "almost", "don't")


# Define a color palette
colors <- brewer.pal(8, "Dark2")  # Choose a palette from RColorBrewer
# To get more colors you can repeat the palette
colors <- colorRampPalette(colors)(100)  # Adjust the number to control color variations

# Preprocess the text data
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, c(stopwords("english"), my_stopwords))
corpus[[83]][1]
# Create the term-frequency table
tdm <- TermDocumentMatrix(corpus)
m <- as.matrix(tdm)
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
# View(d)
# Generate the word cloud
wordcloud(corpus, min.freq = 1, max.words = 500, random.order = FALSE,
          rot.per = 0.35, scale = c(2, 0.1), use.r.layout = FALSE,
          colors = brewer.pal(8, "Set2"))


For example, row 83 contains "Harness AI to Transform Your Learning Ecosystem". For that row, d contains ecosystem, harness, learning, transform, and your, so only "AI" is lost (I think "to" is removed by the stopwords).

tdm <- TermDocumentMatrix(corpus[[83]])
m<- as.matrix(tdm)
v<-sort(rowSums(m),decreasing = TRUE)
d<-data.frame(word=names(v),freq=v)
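
One possible cause, offered as a sketch rather than a confirmed diagnosis: tm's TermDocumentMatrix() keeps only terms of at least three characters by default (the wordLengths control option defaults to c(3, Inf)), so a two-letter term such as "ai" would be dropped before counting, and the same would apply to "to". The example below uses two made-up titles (not the real Excel data) and VCorpus instead of Corpus to compare the default against wordLengths = c(2, Inf). It also assumes that wordcloud() builds its own term matrix with default settings when it is given a corpus, which is why the frequency table is passed explicitly.

```r
library(tm)

# Hypothetical titles standing in for the "Session Title (Session)" column
titles <- c("Harness AI to Transform Your Learning Ecosystem",
            "AI for Talent Development")

# Same preprocessing idea as in the question, on a small VCorpus
corpus <- VCorpus(VectorSource(titles))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Default control: terms shorter than 3 characters are discarded
tdm_default <- TermDocumentMatrix(corpus)

# Lower the minimum term length to 2 so that "ai" can survive
tdm_short <- TermDocumentMatrix(corpus,
                                control = list(wordLengths = c(2, Inf)))

# Compare whether "ai" is present in each matrix
print("ai" %in% Terms(tdm_default))
print("ai" %in% Terms(tdm_short))

# Build the frequency table from the matrix that keeps short terms
v <- sort(rowSums(as.matrix(tdm_short)), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)

# Pass words and frequencies explicitly instead of the corpus
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(words = d$word, freq = d$freq, min.freq = 1)
}
```

If this is indeed the cause, the same control argument could be added to the TermDocumentMatrix(corpus) call in the question, with wordcloud() then called on d$word and d$freq.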

Thank you for your help in advance.

R in word cloud remove the specific words

Upvotes: 1

Views: 55

Answers (0)
