Reputation: 7879
I have a custom function. When I run the function manually, it returns a data frame:
> create_sentiment_df('taggreason', 'republican', 'lost')
Joining, by = "word"
Joining, by = "word"
sentiment prop.sentiment twitter.name party election.result
1 anger 0.04721931 taggreason republican lost
2 anticipation 0.14375656 taggreason republican lost
3 disgust 0.01259182 taggreason republican lost
4 fear 0.06190976 taggreason republican lost
5 joy 0.09024134 taggreason republican lost
6 negative 0.10073452 taggreason republican lost
7 positive 0.26862539 taggreason republican lost
8 sadness 0.03777545 taggreason republican lost
9 surprise 0.03882476 taggreason republican lost
10 trust 0.19832109 taggreason republican lost
However, I want to run this multiple times, so I am using mapply on each row of a data frame. Here is the data, a data frame with only one row (for testing):
> datt1
# A tibble: 1 x 4
twtr_handle party result district_flipped
<chr> <chr> <chr> <chr>
1 taggreason republican lost flipped
And then the function call:
rslt <- mapply(create_sentiment_df, datt1$twtr_handle, datt1$party, datt1$result)
Which returns:
> rslt
taggreason
sentiment Character,10
prop.sentiment Numeric,10
twitter.name Character,10
party Character,10
election.result Character,10
Below is the function. It requires Twitter authorization, so I'm not sure how easily it can be rerun. Is there anything about the function itself that would make mapply return a list rather than a data frame?
library(rtweet)
library(tidytext)
library(tidyverse)
library(BBmisc)
library(reshape)
# c_k, c_s, a_t and a_s hold the Twitter API credentials (defined elsewhere)
create_token(
  app = "Flippable Sentiment Analysis",
  consumer_key = c_k,
  consumer_secret = c_s,
  access_token = a_t,
  access_secret = a_s)
create_sentiment_df <- function(twitter.name, party, election.result) {
  va_stop_words <- stop_words %>% select(-lexicon) %>%
    bind_rows(data.frame(word = c("https", "t.co", "rt", "amp")))
  nrc_lex <- get_sentiments("nrc") # many sentiments
  dat <- get_timeline(twitter.name, n=3200)
  dat$created_at <- as.Date(dat$created_at)
  dat_2017 <- subset(dat, created_at > as.Date('2017-01-01') & created_at < as.Date('2017-11-06'))
  dat_words <- dat_2017 %>%
    select(status_id, text) %>%
    unnest_tokens(word,text)
  dat_words_interesting <- dat_words %>% anti_join(va_stop_words)
  dat_sentiment <- dat_words_interesting %>% left_join(nrc_lex)
  dat_sentiment_count <- dat_sentiment %>%
    filter(!is.na(sentiment)) %>%
    group_by(sentiment) %>%
    summarise(prop.sentiment=n())
  dat_sentiment_count <- na.omit(dat_sentiment_count)
  dat_sentiment_count <- cbind(dat_sentiment_count[1],
                               prop.table(data.matrix(dat_sentiment_count[-1]), margin=2))
  # dat_sentiment_count$twitter.name <- NA
  dat_sentiment_count$twitter.name <- twitter.name
  dat_sentiment_count$party <- party
  dat_sentiment_count$election.result <- election.result
  return(as.data.frame(dat_sentiment_count))
}
Upvotes: 1
Views: 300
Reputation: 2208
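Your create_sentiment_df function returns a data.frame, and mapply simplifies its results by default (SIMPLIFY = TRUE). A data.frame is a list of columns, so the simplification turns the results into a matrix of list elements, which is the output you are seeing.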
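The simplification can be reproduced without Twitter access. This is only a minimal sketch: fake_sentiment_df is a hypothetical stand-in I made up, returning a tiny data frame with the same column names as create_sentiment_df and invented numbers.

```r
# Hypothetical stand-in for create_sentiment_df: no Twitter call, made-up numbers.
fake_sentiment_df <- function(twitter.name, party, election.result) {
  data.frame(
    sentiment = c("anger", "joy"),
    prop.sentiment = c(0.4, 0.6),
    twitter.name = twitter.name,
    party = party,
    election.result = election.result,
    stringsAsFactors = FALSE
  )
}

# Default SIMPLIFY = TRUE: each returned data frame is a list of 5 columns,
# so mapply arranges them into a 5 x 1 matrix whose cells hold the columns.
simplified <- mapply(fake_sentiment_df, "taggreason", "republican", "lost")

is.data.frame(simplified)  # FALSE: the data frame structure is gone
is.matrix(simplified)      # TRUE
dim(simplified)            # 5 1
```

Each cell of that matrix still holds a whole column, which is why the printout shows entries such as Character,10 instead of the values.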
If you need the list of data.frames, you can do:
mapply(create_sentiment_df, datt1$twtr_handle, datt1$party, datt1$result, SIMPLIFY = FALSE)
If you require a single data.frame for all your data.frame outputs, use:
do.call(rbind, mapply(create_sentiment_df, datt1$twtr_handle,
datt1$party, datt1$result, SIMPLIFY = FALSE))
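Here is a sketch of the same pattern with more than one input row. It again assumes a hypothetical stand-in, fake_sentiment_df, and a made-up two-row input (handles with placeholder names), so it runs without Twitter access.

```r
# Hypothetical stand-in for create_sentiment_df (no Twitter call, made-up numbers).
fake_sentiment_df <- function(twitter.name, party, election.result) {
  data.frame(sentiment = c("anger", "joy"), prop.sentiment = c(0.4, 0.6),
             twitter.name = twitter.name, party = party,
             election.result = election.result, stringsAsFactors = FALSE)
}

# Made-up input with two rows; the handles are placeholders.
handles <- data.frame(
  twtr_handle = c("handle_a", "handle_b"),
  party = c("republican", "democrat"),
  result = c("lost", "won"),
  stringsAsFactors = FALSE
)

# SIMPLIFY = FALSE keeps one data frame per input row, collected in a list.
per_handle <- mapply(fake_sentiment_df, handles$twtr_handle, handles$party,
                     handles$result, SIMPLIFY = FALSE)

# Stack the list into a single data frame, then drop the row names that
# rbind derives from the list names.
combined <- do.call(rbind, per_handle)
rownames(combined) <- NULL
```

Resetting the row names is optional. Without it, rbind labels the rows after the list names, which here are the handles.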
Upvotes: 1