Reputation: 824
I am trying to extract tweets using certain keywords.
My code is:
filterStream(file.name = "tweets.json", track = c('fun', 'arbitrary'),
             language = 'en', timeout = 1200, oauth = my_oauth)
The problem is that it doesn't return retweets. I have been looking for a possible solution on the internet, but couldn't find one.
The official documentation lists the following arguments, but doesn't mention how to set filterStream() to retrieve retweet data as well:
file.name = NULL, track = NULL, follow = NULL, locations = NULL, language = NULL, timeout = 0, tweets = NULL, oauth = NULL, verbose = TRUE
Is there something I am missing?
Upvotes: 0
Views: 419
Reputation: 1345
I believe filterStream was written before quoted statuses were introduced, which could explain the discrepancy (the naming convention for retweet may have changed to distinguish between retweets and quoted statuses).
Twitter's API documentation also warns that the streaming API does not return all the same info as the REST API, so the data-tidying functions may set retweets to FALSE by default, expecting Twitter to provide the correct info to override that.
If you dig a little deeper into the API documentation, you may also find that filter levels can be changed to widen or restrict the streaming filter. Though, in my experience, this doesn't result in major differences.
Upvotes: 1
Reputation: 824
The main confusion behind this question was that when I checked the retweeted column of my resulting data, it always showed this:
> head(t16.df$retweeted, 10)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Whereas, when I checked my text column, it did indeed contain retweets.
> head(t16.df$text)
[1] "It's hard, but here's how to support innovation via @HarvardBiz #innovation #startup #create "
[2] "Do you want to work for someone else your entire life ? #startup #success #entrepreneur #business #inspiration"
[3] "RT @StartGrowthHack: How to start a #startup? What is the process to make it a #success. #Entrepreneur "
[4] "RT @ipfconline1: #Startup: Outbound #Marketing vs #InboundMarketing >> The Best Is to Use a Mix of Both #GrowthHack…"
[5] "Understanding and using marke #FolaDanielSpeaks Call +2348034163006 to Book Fola Daniel to speak, train or compere"
[6] "RT GrowthHackers: The Process of Creating Trello #startup -via biconnections"
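Since the text column does carry the retweets, one possible workaround is to flag them from the text itself. The following is only a minimal sketch in base R: it assumes a retweet can be recognised by the conventional "RT " prefix, the helper name is_retweet_text and the is_rt column are made up for illustration, and the small data frame stands in for t16.df with shortened versions of the texts above.

```r
# Hypothetical helper (not part of streamR): TRUE when the tweet text
# starts with "RT" followed by whitespace. This is a heuristic based on
# the text alone, so manually typed "RT ..." tweets are flagged as well.
is_retweet_text <- function(text) {
  grepl("^RT[[:space:]]", text)
}

# Stand-in for t16.df, using shortened versions of the texts shown above
tweets <- data.frame(
  text = c(
    "It's hard, but here's how to support innovation via @HarvardBiz",
    "RT @StartGrowthHack: How to start a #startup?",
    "RT GrowthHackers: The Process of Creating Trello #startup"
  ),
  stringsAsFactors = FALSE
)

# Add the flag as a new column and keep only the flagged rows
tweets$is_rt <- is_retweet_text(tweets$text)
retweets_only <- tweets[tweets$is_rt, ]
```

On the real data the same idea would be t16.df$is_rt <- is_retweet_text(t16.df$text), leaving the original retweeted column untouched.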
Upvotes: 0