Reputation: 1
I have transcript files from 6 different interviews and am running a sentiment analysis on the text using VADER. The compound score came out as 1 for every file. This does not seem correct to me, but I'm not sure why it happened or how to troubleshoot it.
The code I have is:
for (i in MD_scripts) {
  file_MD <- read_file(i)
  gsub("[\r\n]", "", file_MD)
  vader_MD <- get_vader(file_MD)
  df_vader <- data.frame(rbind(df_vader, vader_MD))
}
The pos, neu, neg scores are also eerily similar, but not exactly the same. Any tips/ideas?
I thought of running VADER on individual sentences, which worked, and then calculating the overall compound score by hand, but I could not figure out how to combine the sentence scores.
Upvotes: 0
Views: 166
Reputation: 76651
Here are two ways of correcting the code in the question.
One, use a for loop. You will have to create a results list, vader_list, beforehand.
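library(vader)

# Pre-allocate one list slot per transcript file
vader_list <- vector("list", length = length(MD_scripts))
for (i in seq_along(MD_scripts)) {
  # Read the file line by line and join the lines into a single string
  # (the native pipe |> needs R 4.1 or later)
  file_MD <- MD_scripts[[i]] |>
    readLines() |>
    paste(collapse = " ")
  vader_list[[i]] <- get_vader(file_MD)
}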
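If a data frame is needed afterwards, as in the question, the list can be bound row-wise once the loop above has finished. This is only a minimal sketch, assuming each get_vader() result is a named vector with elements such as compound, pos, neu and neg. The entries in mock_list are made-up placeholder values standing in for vader_list, not real VADER output.

```r
# Hypothetical stand-in for vader_list: two made-up results, shaped like
# the named vectors that get_vader() is assumed to return.
mock_list <- list(
  c(compound = "0.999", pos = "0.21", neu = "0.70", neg = "0.09"),
  c(compound = "0.998", pos = "0.19", neu = "0.72", neg = "0.09")
)

# Bind all results once, after the loop, instead of growing a data frame
# inside it.
df_vader <- as.data.frame(do.call(rbind, mock_list), stringsAsFactors = FALSE)

# The values are character strings here, so convert every column to numeric.
df_vader[] <- lapply(df_vader, as.numeric)
df_vader
```

If the real output carries non-numeric elements (I believe there is a word_scores string), drop those before converting.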
You can also use an lapply loop, which makes the code simpler.
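library(vader)

# Apply the same read-collapse-score steps to every file; returns a list
# (the \(fl) lambda shorthand and |> need R 4.1 or later)
vader_list <- lapply(MD_scripts, \(fl) {
  fl |>
    readLines() |>
    paste(collapse = " ") |>
    get_vader()
})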
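As for why every compound score comes out as 1: as far as I know, VADER sums the valence scores of all words and squashes that sum x into the range (-1, 1) with x / sqrt(x^2 + alpha), where alpha = 15 in the reference implementation. For a whole interview transcript the sum is large, so the result saturates near 1. Below is a base-R sketch of that formula. normalize_sum is a hypothetical helper written for illustration, not a function from the vader package, and the sums are made-up inputs.

```r
# Hypothetical helper mirroring VADER's normalization as I understand it;
# alpha = 15 is the constant used in the reference implementation.
normalize_sum <- function(x, alpha = 15) {
  x / sqrt(x^2 + alpha)
}

# Made-up valence sums: a short sentence, a paragraph, a long transcript.
sums <- c(2, 20, 200)
compound <- normalize_sum(sums)

# The longer the text, the closer the score gets to 1.
round(compound, 4)
```

Scoring the text sentence by sentence and averaging the compound values, as the question suggests, is a common workaround for this saturation.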
Upvotes: 0