J.Q

Reputation: 1031

create flag based on row values in grep()

I have a 10-row data frame of tweets about potatoes and need to flag them based on the punctuation each tweet contains (question marks or exclamation points). The grep function returns the row numbers where these characters appear:

grep("\\?", potatoes$tweet)
grep("!", potatoes$tweet)

I've tried to create the flag variable question with mutate in dplyr as shown below, but it throws an error:

potatoes$question <- NA
potatoes <- mutate(potatoes, question = +row_number(grep("\\?", potatoes$tweet)))

Error in mutate_impl(.data, dots) : 
Column `question` must be length 10 (the number of rows) or one, not 3

I'm also happy to consider more elegant solutions than conditioning on the output of grep. Any help appreciated!

Upvotes: 1

Views: 266

Answers (1)

akrun

Reputation: 887048

We can use grepl instead of grep. grep returns the indices (positions) of the matching elements, so its length depends on the number of matches, whereas grepl returns a logical vector with one element per row, where TRUE denotes a match and FALSE a non-match. That logical vector can be used directly as a flag:

i1 <- grepl("\\?", potatoes$tweet)

If we need row numbers instead, multiply the flag by the row index (non-matching rows become 0):

potatoes$question <- i1 * seq_len(nrow(potatoes))

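Similarly, the indices returned by grep can be used for assignment. Only the matching rows are overwritten, so the question column should already exist (for example, initialized to NA as in the question):

i2 <- grep("\\?", potatoes$tweet)
potatoes$question[i2] <- seq_len(nrow(potatoes))[i2]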
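
To make the difference between grep and grepl concrete, here is a minimal base R sketch. The sample tweets and the column names exclamation and question_row are made up for illustration, since the original potatoes data frame is not shown.

```r
# Hypothetical sample data; the real `potatoes` data frame is not shown in the question.
potatoes <- data.frame(
  tweet = c("Are potatoes vegetables?",
            "I love fries!",
            "Mashed potatoes tonight",
            "Who wants chips?!"),
  stringsAsFactors = FALSE
)

# grep() returns only the positions of the matching rows,
# so its length depends on how many tweets match.
matches <- grep("\\?", potatoes$tweet)

# grepl() returns one TRUE/FALSE per row, which fits a data frame column.
# as.integer() turns the logical flag into 1/0.
potatoes$question    <- as.integer(grepl("\\?", potatoes$tweet))
potatoes$exclamation <- as.integer(grepl("!", potatoes$tweet, fixed = TRUE))

# Row number where the tweet contains a question mark, NA otherwise.
potatoes$question_row <- ifelse(potatoes$question == 1L,
                                seq_len(nrow(potatoes)),
                                NA_integer_)

print(potatoes)
```

Because grepl always returns a vector as long as its input, the same expression should also work inside dplyr, for example mutate(potatoes, question = as.integer(grepl("\\?", tweet))), without the length error from the question.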

Upvotes: 2
