Reputation: 63
I'm very new to R and have started using the tidytext package.
I'm trying to pass variables as arguments to the unnest_tokens
function so I can analyze multiple columns. So instead of this:
library(janeaustenr)
library(tidytext)
library(dplyr)
library(stringr)
original_books <- austen_books() %>%
  group_by(book) %>%
  mutate(linenumber = row_number(),
         chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]",
                                                 ignore_case = TRUE)))) %>%
  ungroup()
original_books
tidy_books <- original_books %>%
  unnest_tokens(word, text)
The last line of code would be:
output <- 'word'
input <- 'text'
tidy_books <- original_books %>%
  unnest_tokens(output, input)
But I'm getting this:
Error in check_input(x) : Input must be a character vector of any length or a list of character vectors, each of which has a length of 1.
I've tried using as.character() without much luck.
Any ideas on how this would work?
Upvotes: 4
Views: 11178
Reputation: 44
I had the same issue. I solved it by specifying the input as below:
unnest_tokens(input = "events", token = "words", "word")
where "events" is my actual column name.
Upvotes: 0
Reputation: 13118
Try
tidy_books <- original_books %>%
  unnest_tokens_(output, input)
with the underscore in unnest_tokens_. unnest_tokens_ is the "standard evaluation" version of unnest_tokens, and allows you to pass in variable names as strings. See Non-standard evaluation for a discussion of standard vs non-standard evaluation.
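For what it's worth, the original call fails because unnest_tokens treats the bare names output and input as column names instead of looking up the strings stored in those variables. As far as I know, the underscore variants were later deprecated in tidytext in favor of tidy evaluation, so on a recent version something like the sketch below may be needed instead. It assumes a tidytext version that supports tidy evaluation; the helper name tokenize_column and the toy data frame are made up for illustration and are not part of tidytext.

```r
library(dplyr)
library(tidytext)
library(rlang)

# Hypothetical helper (not part of tidytext): takes the input and output
# column names as strings, converts them to symbols with sym(), and
# unquotes them with !! so unnest_tokens sees them as column names.
tokenize_column <- function(df, input, output = "word") {
  df %>%
    unnest_tokens(!!sym(output), !!sym(input))
}

# Toy data with two text columns, standing in for original_books.
toy <- tibble(
  id    = 1:2,
  title = c("Pride and Prejudice", "Emma"),
  body  = c("It is a truth", "universally acknowledged")
)

# The same helper can now be pointed at different columns.
tidy_body  <- tokenize_column(toy, "body")
tidy_title <- tokenize_column(toy, "title")
```

With the question's data, the equivalent call would be tokenize_column(original_books, input, output), which makes it straightforward to loop over several column names.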
Upvotes: 5