gogolaygo

Reputation: 325

str_extract_all returns a list but I want a vector

Still relatively new to R here. I have a column of tweets, and I'm trying to create a column that contains the retweet handle "RT @blahblah", like this:

Tweets                            Retweetfrom
RT @john I had a good day         RT @john
RT @josh I had a bad day          RT @josh

This is my code:

r$Retweetfrom <- str_extract_all(r$Tweets, "^RT[:space:]+@[:graph:]+")

It's giving me the result alright, but instead of a vector, the new column is a list. When I try to unlist it, it throws me an error:

Error in `$<-.data.frame`(`*tmp*`, "Retweetfrom", value = c("@AlpineITW", "@AllScienceGlobe",  : replacement has 1168 rows, data has 2306

Anyone know how to deal with this? Thanks a lot.

Upvotes: 5

Views: 3808

Answers (3)

user29035994

Reputation: 1

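You can also just unlist after extraction, but only if every tweet yields exactly one match. unlist() drops the empty elements for tweets with no match, so otherwise the result is shorter than the data frame and the assignment fails: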

r$Retweetfrom <- str_extract_all(r$Tweets, "^RT[:space:]+@[:graph:]+") %>%
  unlist()
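
To see why the length changes, here is a minimal base R sketch. The `matches` list is a hypothetical stand-in for what str_extract_all() would return for three tweets where the middle one has no retweet handle; the `vapply()` step is just one possible way to get back to one value per row, not the only one.

```r
# Hypothetical stand-in for the list returned by str_extract_all():
# one element per tweet, character(0) where nothing matched.
matches <- list("RT @john", character(0), "RT @josh")

length(matches)          # 3, one element per row
length(unlist(matches))  # 2, the empty element is dropped

# Keep exactly one value per row, using NA where there was no match.
retweet_from <- vapply(
  matches,
  function(m) if (length(m) > 0) m[1] else NA_character_,
  character(1)
)
retweet_from
# [1] "RT @john" NA         "RT @josh"
```

Since `retweet_from` has the same length as the input, it can be assigned to a data frame column without the "replacement has ... rows" error.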

Upvotes: 0

akrun

Reputation: 887541

If we are interested in a base R option, sub will be useful:

r$Retweetfrom <- sub(".*\\b(RT\\s+@[[:graph:]]+)\\s+.*",
                     "\\1", r$Tweets)
r$Retweetfrom
#[1] "RT @john" "RT @josh"
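
One thing to keep in mind: when the pattern does not match, sub returns the input string unchanged rather than NA. If NA is preferred for tweets without a retweet handle, a rough base R sketch with regexpr and regmatches could look like this (the `tweets` vector and the variable names are made up for illustration):

```r
# Made-up example data: the second tweet has no retweet handle.
tweets <- c("RT @john I had a good day", "Just a normal tweet")

# sub() leaves non-matching strings as they are.
unchanged <- sub(".*\\b(RT\\s+@[[:graph:]]+)\\s+.*", "\\1", tweets[2])
unchanged
# [1] "Just a normal tweet"

# regexpr() returns -1 where there is no match, and regmatches()
# returns only the matched pieces, so fill a vector of NAs by position.
m <- regexpr("^RT[[:space:]]+@[[:graph:]]+", tweets)
out <- rep(NA_character_, length(tweets))
out[m > 0] <- regmatches(tweets, m)
out
# [1] "RT @john" NA
```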

Upvotes: 3

Jonathan Carroll

Reputation: 3947

Assuming there's just one RT @user in each row of the Tweets column (not a very strong assumption), you probably want str_extract (which is vectorised over the strings and returns one result per string) rather than str_extract_all (which may return multiple results per row), i.e.

r$Retweetfrom <- str_extract(r$Tweets, "^RT[:space:]+@[:graph:]+")

in which case you will get the first mention of RT @user, which is probably the one you want anyway.
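
As a quick illustration of the difference, here is a small sketch on made-up data (the `tweets` vector is hypothetical, and stringr is assumed to be installed). It only compares the shapes of the two results:

```r
library(stringr)

# Made-up example data: the second tweet has no retweet handle.
tweets <- c("RT @john I had a good day",
            "Just a normal tweet",
            "RT @josh I had a bad day")
pattern <- "^RT[:space:]+@[:graph:]+"

# str_extract(): character vector, one entry per input string,
# with NA where nothing matched.
first <- str_extract(tweets, pattern)
length(first) == length(tweets)

# str_extract_all(): list with one element per input string;
# elements can have zero, one or several matches.
all_matches <- str_extract_all(tweets, pattern)
lengths(all_matches)
```

Because `first` always has one entry per tweet, it can be assigned straight to r$Retweetfrom.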

Upvotes: 3
