Reputation: 1031
I have a 10-row data frame of tweets about potatoes
and need to flag them based on the punctuation each tweet
contains (question marks or exclamation points). The grep
function returns the row numbers where these characters appear:
grep("\\?", potatoes$tweet)
grep("!", potatoes$tweet)
I've tried to create the flag variable question with mutate in dplyr, as shown:
potatoes$question <- NA
potatoes <- mutate(potatoes, question = +row_number(grep("\\?", potatoes$tweet)))
Error in mutate_impl(.data, dots) :
Column `question` must be length 10 (the number of rows) or one, not 3
I'm also happy to consider more elegant solutions than conditioning on the output of grep. Any help appreciated!
Upvotes: 1
Views: 266
Reputation: 887048
We can use grepl instead of grep. grep returns the index/position of each match, whereas grepl returns a logical vector in which TRUE denotes a matching element and FALSE a non-matching one, so it can be used directly as a flag:
i1 <- grepl("!", potatoes$tweet)
If we need row numbers instead, multiply the logical vector by the row index (non-matching rows become 0):
potatoes$question <- i1 * seq_len(nrow(potatoes))
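To make the difference concrete, here is a minimal sketch on a made-up four-row data frame (the tweets are invented for illustration and are not the asker's data). It shows why the grep result cannot be assigned as a column directly, while the grepl result can:

```r
# Hypothetical stand-in for the asker's data; the tweets are invented
potatoes <- data.frame(
  tweet = c("Are potatoes vegetables?", "I love fries!",
            "Mashed tonight", "Really?!"),
  stringsAsFactors = FALSE
)

idx  <- grep("\\?", potatoes$tweet)   # positions of the matching rows only
flag <- grepl("\\?", potatoes$tweet)  # one TRUE/FALSE per row

length(idx)   # shorter than nrow(potatoes), hence the mutate error
length(flag)  # same as nrow(potatoes)

# Row number where the tweet contains "?", 0 elsewhere
potatoes$question <- flag * seq_len(nrow(potatoes))
```

Because flag has one element per row, it can be stored as a column as-is; the multiplication is only needed if the row number itself is wanted.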
Similarly, the indices returned by grep can be used to assign row numbers to the matching rows:
i2 <- grep("!", potatoes$tweet)
potatoes$question[i2] <- seq_len(nrow(potatoes))[i2]
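Since the question uses dplyr, the same idea should also work inside mutate. The sketch below assumes a plain TRUE/FALSE flag is enough; the toy data and the exclamation column name are made up for illustration. Passing fixed = TRUE treats the pattern as a literal string, so "?" needs no escaping:

```r
library(dplyr)

# Hypothetical stand-in for the asker's data; the tweets are invented
potatoes <- data.frame(
  tweet = c("Are potatoes vegetables?", "I love fries!",
            "Mashed tonight", "Really?!"),
  stringsAsFactors = FALSE
)

# grepl returns one value per row, so mutate accepts it as a new column
potatoes <- potatoes %>%
  mutate(
    question    = grepl("?", tweet, fixed = TRUE),
    exclamation = grepl("!", tweet, fixed = TRUE)  # assumed column name
  )
```

Inside mutate the column can be referenced as tweet directly; there is no need for potatoes$tweet or for creating the column with NA first.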
Upvotes: 2