meriem

Reputation: 1

Standardizing the number of words by the length of the article for sentiment analysis

I'm conducting sentiment analysis with a dictionary. The data is newspaper articles, so my unit of analysis needs to be the story and not the sentence (my prof corrected me on this). He also said I needed to standardize the number of words by the length of the article before doing my regression analysis. This is the part I'm struggling with.

# clean the data a bit and tokenize
library(dplyr)
library(tidytext)

data.C <- read.csv("filelocation", stringsAsFactors = FALSE)
data.C <- data.C %>%
  unnest_tokens(word, sentence)

data.C <- data.C %>%
  anti_join(stop_words, by = "word")
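Since the goal is to standardize by article length, one option is to capture the total word count per story at this point, before stop-word removal and the sentiment join shrink the data. A minimal sketch of that idea, where the toy data frame and the column name `n_words` are my own illustration (only `story_index` and `word` come from the question):

```r
library(dplyr)

# Toy tokenized data, one row per word, mimicking data.C after unnest_tokens()
data.C <- data.frame(
  story_index = c(1, 1, 1, 2),
  word = c("the", "market", "rallied", "losses")
)

# Total words per article, counted before any filtering
article_length <- data.C %>%
  count(story_index, name = "n_words")
```

The `n_words` column can later be joined back on `story_index` so the summed sentiment values are divided by the full article length rather than by the number of dictionary matches.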

# obtain sentiment scores

struggle <- data.C %>%
  inner_join(get_sentiments("afinn"), by = "word")

struggle %>%
  group_by(story_index) %>%
  summarize(weights(sum(value)/n()))

This last command gets me the error message:

Error in `summarize()`:
! Problem while computing `..1 = weights(sum(value)/n())`.
i The error occurred in group 1: story_index = 1.
Caused by error:
! `$` operator is invalid for atomic vectors
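For what it's worth, the error comes from passing a plain number to `weights()`, which is the stats accessor for model objects, not a weighting function. Dropping it and naming the summary column avoids the error. A minimal sketch, using a toy stand-in for `struggle` (the `story_index` and `value` columns are assumed from the question; `mean_score` is a name I made up):

```r
library(dplyr)

# Toy stand-in for `struggle`: one row per sentiment-matched word,
# with its AFINN value
struggle <- data.frame(
  story_index = c(1, 1, 2),
  word  = c("good", "bad", "great"),
  value = c(3, -3, 3)
)

# Mean AFINN score per story: name the column, no weights() call
scores <- struggle %>%
  group_by(story_index) %>%
  summarize(mean_score = sum(value) / n())
```

Note that `n()` here counts only the words that matched the AFINN dictionary; to standardize by the full article length, the denominator would need to be a token count taken before the `inner_join`.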

Also, here's what my data set looks like. The third variable is the sentiment score. [Snapshot of dataset]

Upvotes: 0

Views: 49

Answers (0)
