Reputation: 30993
Let's say I have a dataframe:
word <- c("good", "great", "bad", "poor", "eh")
userid <- c(1, 2, 3, 4, 5)
d <- data.frame(userid, word)
I want to add a dataframe column, sentiment, that is a factor and depends on what word is:
words_pos <- c("good", "great")
words_neg <- c("bad", "poor")
calculate_sentiment <- function(x) {
  if (x %in% words_pos) {
    return("pos")
  } else if (x %in% words_neg) {
    return("neg")
  }
  return(NA)
}
d$sentiment <- apply(d, 1, function(x) calculate_sentiment(x['word']))
However, now d$sentiment is of type "character". How do I make it a factor with the right levels: pos, neg, NA? I'm not even sure if NA should be a factor level, as I'm just learning R.
Thanks!
Upvotes: 1
Views: 13313
Reputation: 3947
This isn't going to be the simplest way to do it, but it's a very readable way (in my opinion, preferable to using an abstracted function): using dplyr's mutate along with case_when:
library(dplyr)
d2 <- mutate(d, sentiment = factor(case_when(word %in% words_pos ~ "pos",
                                             word %in% words_neg ~ "neg",
                                             TRUE ~ NA_character_)))
glimpse(d2)
#> Observations: 5
#> Variables: 3
#> $ userid <dbl> 1, 2, 3, 4, 5
#> $ word <fctr> good, great, bad, poor, eh
#> $ sentiment <fctr> pos, pos, neg, neg, NA
I've spaced it out a bit so it's clearer, but this will take data.frame d, then mutate (change a column) 'sentiment' to be equal to a factor, defined by a case statement with logicals on the LHS and results on the RHS (NA_character_ is required so that everything is the same type). The output confirms that this is a factor column with the desired values.
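If the order of the levels matters, one option is to pass levels explicitly to factor(); otherwise the levels are sorted alphabetically (neg before pos). The following is only a sketch that rebuilds the question's data so it runs on its own; the levels argument is my addition, not part of the code above.

```r
library(dplyr)

# Data and word lists from the question, rebuilt so the example is standalone.
words_pos <- c("good", "great")
words_neg <- c("bad", "poor")
d <- data.frame(userid = 1:5,
                word = c("good", "great", "bad", "poor", "eh"))

# Same case_when as above, but with the level order stated explicitly
# (assumption: "pos" should come before "neg").
d2 <- mutate(d, sentiment = factor(
  case_when(word %in% words_pos ~ "pos",
            word %in% words_neg ~ "neg",
            TRUE ~ NA_character_),
  levels = c("pos", "neg")))

levels(d2$sentiment)   # "pos" "neg"
is.na(d2$sentiment)    # TRUE only for the unmatched word "eh"
```

The unmatched row is stored as a missing value; it does not show up in levels().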
Upvotes: 4
Reputation: 167
You can wrap the last line of the code in as.factor, which will give a factor with the levels pos and neg. BTW, NA is not a factor level by default; it is just a missing value.
d$sentiment <- as.factor(apply(d, 1, function(x) calculate_sentiment(x['word'])))
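As a possible alternative, here is a sketch that avoids apply() altogether, since %in% already works on a whole column. It rebuilds the question's data so it can be run on its own. The nested ifelse(), the explicit levels, and the addNA() call are my own illustration (not the asker's code); addNA() is only there to show what it looks like if you do want NA as a level.

```r
# Data and word lists from the question, rebuilt so the example is standalone.
words_pos <- c("good", "great")
words_neg <- c("bad", "poor")
d <- data.frame(userid = 1:5,
                word = c("good", "great", "bad", "poor", "eh"))

# Vectorised lookup: %in% tests every element of the column at once.
labels <- ifelse(d$word %in% words_pos, "pos",
          ifelse(d$word %in% words_neg, "neg", NA_character_))

# Explicit levels fix the order; unmatched words stay as missing values.
d$sentiment <- factor(labels, levels = c("pos", "neg"))

levels(d$sentiment)          # "pos" "neg"
levels(addNA(d$sentiment))   # "pos" "neg" NA  (NA promoted to a level)
```

Whether NA should be a level depends on what you do next: most modelling and summary functions expect it to stay a plain missing value.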
Upvotes: 1