Reputation: 63
I'd like to look up the profiles of a user's Twitter followers using R, for an account with more than 100,000 followers. twitteR is a great package, but it struggles with large follower counts because one needs to implement a sleep routine to avoid exceeding the rate limits. I am a relative novice here: how might one loop through the follower ID object, passing in follower IDs in batches of 100 (the maximum the Twitter API can process per request)?
Edit: code added
library(twitteR)
library(plyr)
maxTwitterIds = 100
sleeptime = 500 # sec
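zz <- getUser("[username]")
followers <- zz$getFollowerIDs()
# Split the IDs into batches of at most maxTwitterIds; unlike a matrix, a list of
# batches also copes with a follower count that is not a multiple of 100
id_batches <- split(followers, ceiling(seq_along(followers) / maxTwitterIds))
# note: for smaller lists of followers it is possible to use the command "lookupUsers(followers)" at this point
getTwitterInfoForListIds <- function(id_list) {
  foll <- lookupUsers(id_list)
  # Read each field directly from the returned user objects
  list(
    names       = sapply(foll, function(u) u$name),
    sn          = sapply(foll, function(u) u$screenName),
    id          = sapply(foll, function(u) u$id),
    verified    = sapply(foll, function(u) u$verified),
    created     = sapply(foll, function(u) u$created),
    statuses    = sapply(foll, function(u) u$statusesCount),
    follower    = sapply(foll, function(u) u$followersCount),
    friends     = sapply(foll, function(u) u$friendsCount),
    favorites   = sapply(foll, function(u) u$favoritesCount),
    location    = sapply(foll, function(u) u$location),
    url         = sapply(foll, function(u) u$url),
    description = sapply(foll, function(u) u$description),
    last_status = sapply(foll, function(u) u$lastStatus)
  )
}
# Look up one batch at a time, pausing between batches to respect the rate limit
alldata <- llply(id_batches, function(id_set) {
  info <- getTwitterInfoForListIds(id_set)
  Sys.sleep(sleeptime)
  info
})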
Upvotes: 5
Views: 4526
Reputation: 66490
This can also be done using the newer rtweet package.
Per the example here: https://github.com/mkearney/rtweet
# Get followers
# Retrieve a list of the accounts following a user.
library(rtweet)
## get user IDs of accounts following CNN
cnn_flw <- get_followers("cnn", n = 75000)
# lookup data on those accounts
cnn_flw_data <- lookup_users(cnn_flw$user_id)
# Or if you really want ALL of their followers:
# how many total followers does cnn have?
cnn <- lookup_users("cnn")
# get them all (this would take a little over 5 days)
cnn_flw <- get_followers("cnn", n = cnn$followers_count,
                         retryonratelimit = TRUE)
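For a rough sense of where an estimate like "a little over 5 days" comes from, here is a minimal sketch of the arithmetic. It assumes the follower-ID endpoint returns at most 5,000 IDs per request and allows 15 requests per 15-minute window (75,000 IDs per window, which matches the `n = 75000` above). The function name `estimate_days` and the 40 million follower count are made up for illustration and are not part of rtweet.

```r
# Assumption: 5000 IDs per request, 15 requests per 15-minute window
ids_per_window <- 5000 * 15
window_minutes <- 15

# Estimated waiting time, in days, to download n_followers follower IDs
# (hypothetical helper, not an rtweet function)
estimate_days <- function(n_followers) {
  windows <- ceiling(n_followers / ids_per_window)
  # The first window needs no waiting; every later one costs a full window
  (windows - 1) * window_minutes / (60 * 24)
}

estimate_days(75000)  # 0: everything fits in a single window
estimate_days(40e6)   # hypothetical account with 40 million followers
```

The estimate only covers time spent waiting on the rate limit; the actual download time per request comes on top of that.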
Upvotes: 1
Reputation: 60934
Let me start by saying that I have not used the twitteR package, so I can only provide pseudocode that shows the structure of how to do this. That should get you started.
library(plyr)
# Some constants
maxTwitterIds = 100
sleeptime = 1 # sec
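# Get the ids of the Twitter followers of person X.
# getTwitterFollowers() is a placeholder for the real lookup, so dummy ids are used here.
ids = 1:1000 # in practice: ids = getTwitterFollowers("x")
# One batch per column; this assumes length(ids) is a multiple of maxTwitterIds
ids_matrix = matrix(ids, nrow = maxTwitterIds,
                    ncol = length(ids) / maxTwitterIds)
# Placeholder for the per-id lookup; replace the body with the real call
getTwitterInfo = function(id) {
  list(id = id)
}
getTwitterInfoForListIds = function(id_list) {
  return(lapply(id_list, getTwitterInfo))
}
# Find the information you need from each id
alldata = alply(ids_matrix, 2, function(id_set) {
  info = getTwitterInfoForListIds(id_set)
  Sys.sleep(sleeptime)
  return(info)
})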
The data structure you get out of this (a nested list) may need some polishing, but without knowing what you want to extract from the Twitter accounts, it is hard to say how.
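As one possible way to do that polishing, here is a minimal sketch in base R that batches the ids with `split()` (so a final batch shorter than 100 is kept) and stacks the per-batch results into a single data frame. The function `lookup_batch` is a hypothetical stand-in for the real per-batch Twitter lookup, and its columns are invented for illustration.

```r
# Hypothetical stand-in for the real per-batch lookup; returns one row per id
lookup_batch <- function(id_set) {
  data.frame(id = id_set,
             screen_name = paste0("user_", id_set),
             stringsAsFactors = FALSE)
}

ids <- 1:250       # dummy ids; 250 is deliberately not a multiple of 100
max_ids <- 100
sleeptime <- 0     # seconds; raise this for real API calls

# split() produces batches of at most max_ids and keeps the short final batch
batches <- split(ids, ceiling(seq_along(ids) / max_ids))

results <- lapply(batches, function(id_set) {
  info <- lookup_batch(id_set)
  Sys.sleep(sleeptime)  # pause between batches to respect the rate limit
  info
})

# Stack the per-batch data frames into one table, one row per follower
all_users <- do.call(rbind, results)
nrow(all_users)  # 250
```

Returning a data frame per batch instead of a list per id avoids the nested list altogether, at the cost of fixing the set of columns up front.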
Upvotes: 0