Dave

Reputation: 980

User lookup on Twitter API from R results in error (403)

Using the Twitter API and the twitteR package, I am trying to retrieve the user objects for a long list of names (between 50,000 and 100,000).

I keep getting the following error:

Error in twInterfaceObj$doAPICall(paste("users", "lookup", sep = "/"),  : 
  client error: (403) Forbidden

The error code supposedly hints at "update limits". However, the rate limit on user lookups is 180 requests per 15-minute window, and lookups are performed in batches of 100 user names, so up to 18,000 users shouldn't be a problem. Even reducing the number to 6,000 per 15-minute window (to respect the limit on requests via application-only auth) results in the same error.
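
For reference, the arithmetic behind these figures can be written out as a quick sketch. The numbers are the ones quoted above, not values queried from the API, and the variable names are only illustrative:

```r
# Figures as quoted in the question; not fetched from the API.
batch_size         <- 100    # user names per users/lookup request
requests_user_auth <- 180    # requests per 15-minute window (user auth)
users_app_only     <- 6000   # users per window under application-only auth

# Users that fit into one window with user auth
users_user_auth <- requests_user_auth * batch_size
print(users_user_auth)       # 18000

# Requests needed per window to cover the application-only figure
requests_app_only <- users_app_only / batch_size
print(requests_app_only)     # 60
```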

Here is an MWE (for which you do, however, need your own API keys):

library(plyr)
# install the latest versions from github:
# devtools::install_github("twitteR", username="geoffjentry")
# devtools::install_github("hadley/httr")
library(twitteR)
library(httr)    

source("TwitterKeys.R") # Your own API-Keys
setup_twitter_oauth(consumerKey, consumerSecret, accessToken, accessSecret)

# The following is just to generate a large enough list of user names:
searchTerms <- c("worldcup", "economy", "climate", "wimbledon", 
                 "apple", "android", "news", "politics")

# This might take a while
sample <- llply(searchTerms, function(term) {
  tweets <- twListToDF(searchTwitter(term, n=3200))
  users <- unique(tweets$screenName)
  return(users)
})

userNames <- unique(unlist(sample))

# This function is supposed to perform the lookups in batches 
# and mind the rate limit:
getUserObjects <- function(users) {
  groups <- split(users, ceiling(seq_along(users)/6000))
  userObjects <- ldply(groups, function(group) {
    objects <- lookupUsers(group)
    out <- twListToDF(objects)
    print("Waiting for 15 Minutes...")
    Sys.sleep(900)
    return(out)
  })
  return(userObjects)
}

# Putting it into action:
userObjects <- getUserObjects(userNames)

Sometimes looking up smaller subsets manually e.g. via lookupUsers(userNames[1:3000]) works; when I try to automate the process, however, the error gets thrown.

Does anyone have an idea what the reason for this might be?

Upvotes: 9

Views: 2556

Answers (2)

calder-ty

Reputation: 456

I know this question is old, but I had this issue recently and couldn't find any responses that adequately solved the problem.

BOTTOM LINE UP FRONT:

Adding a tryCatch() error-handling system and splitting each failing call into two smaller calls of 50 IDs fixed the issue.

LONG STORY

For me, the API seemed to fail at the same point each time (around the 4,100th ID). After adding some error handling, I was able to identify around 8 sections of 100 in my list of IDs that didn't work. However, when using the Twitter API Console, those IDs worked. I went through the code on GitHub, but couldn't find a reason why it should break. Experimentation showed that splitting the call in two works perfectly. Here's a sample of code that works.

N <- NROW(Data)      # Keeps track of how many more id's we have
count <- 1           # Keeps track of which ID we are at
Len <- N             # so we don't index out of range (see below)
Stop <- 0            # Contains the value that we should Stop each batch at
j <- 0               # Keeps track of how many calls made
while (N > 0 && j < 180) {

    tryCatch({
    
    # Set The Stop value so that if we hit the end of the list it doesn't
    # Give a value that is out of range
    Stop <<- min(c(count + 99, Len))
    
    # Keep track of how many calls we have made
    j = j + 1   
    User_Data <- lookupUsers(Data$user_id_str[count:Stop], includeNA = TRUE)

    #... CODE THAT STORES DATA AS NEEDED
    
    # Update for next iteration
    N <<- N - 100
    count <<- count + 100
    message(paste("Users Searched: ", (count-1), "/", Len))

    },

    error = function(e) {
  
      message("Twitter sent back an error (e.g. 403), trying again with half as many IDs")
      Stop <<- min(c(count + 49, Len))
  
      j <<- j + 1
      # RETRY, FIRST HALF
      User_Data <- lookupUsers(Data$user_id_str[count:Stop], includeNA = TRUE)
  
      #... CODE THAT STORES DATA AS NEEDED
      N <<- N - 50
      count <<- count + 50
      message(paste("Users Searched: ", Stop, "/", Len))
  
      # RETRY, SECOND HALF (skipped if the first half already reached the end)
      if (count <= Len) {
        Stop <<- min(c(count + 49, Len))
  
        j <<- j + 1
        User_Data <- lookupUsers(Data$user_id_str[count:Stop], includeNA = TRUE)
  
        #... CODE THAT STORES DATA AS NEEDED
        N <<- N - 50
        count <<- count + 50
        message(paste("Users Searched: ", Stop, "/", Len))
      }
    })

}
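
The same split-on-failure idea can also be sketched as a small recursive helper. This is only an illustration, not part of twitteR: `lookup_with_split` and its `lookup_fn` argument are hypothetical names, and `lookup_fn` stands in for a call such as `function(ids) lookupUsers(ids, includeNA = TRUE)`. The stand-in `fake_lookup` below exists only so the sketch runs without API keys.

```r
# Hypothetical helper (not part of twitteR): try the whole batch first;
# if the call errors, split the batch in half and retry each half.
lookup_with_split <- function(ids, lookup_fn, min_size = 1) {
  tryCatch(
    lookup_fn(ids),
    error = function(e) {
      if (length(ids) <= min_size) {
        stop(e)  # cannot split any further, so re-raise the error
      }
      half <- ceiling(length(ids) / 2)
      c(
        lookup_with_split(ids[seq_len(half)], lookup_fn, min_size),
        lookup_with_split(ids[-seq_len(half)], lookup_fn, min_size)
      )
    }
  )
}

# Stand-in for the real API call (assumption for demonstration only):
# it fails for batches larger than 50, mimicking the behaviour described above.
fake_lookup <- function(ids) {
  if (length(ids) > 50) stop("client error: (403) Forbidden")
  as.list(ids)
}

res <- lookup_with_split(1:100, fake_lookup)
print(length(res))  # 100
```

Note that every retry is an extra API call, so a real loop would still need to count calls against the rate limit, as the `j` counter does above.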

Upvotes: 1

orange1

Reputation: 2939

According to this answer (I hit the rate limit for twitteR even from the first request), there are limits not only on the total number of users, but also on the number of calls per 15-minute interval. If each call covers 100 users and you are trying to look up 6000 users, you are making 60 calls, which is more than the 15 you are allowed. Try putting the program to sleep and having it send out the next calls once the 15 minutes are up.
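
A minimal sketch of that approach, assuming the figure of 15 calls per window given above (it is taken from this answer, not verified against the API). `lookup_in_windows`, `lookup_fn` and `sleep_fn` are hypothetical names; `lookup_fn` stands in for `lookupUsers`, and `sleep_fn` defaults to `Sys.sleep` but can be swapped out for a dry run.

```r
# Hypothetical helper (not part of twitteR): send batches of `batch_size`
# users and pause for `window_secs` after every `calls_per_window` calls.
lookup_in_windows <- function(users, lookup_fn,
                              batch_size = 100,
                              calls_per_window = 15,   # figure from this answer
                              window_secs = 900,
                              sleep_fn = Sys.sleep) {
  batches <- split(users, ceiling(seq_along(users) / batch_size))
  results <- vector("list", length(batches))
  for (i in seq_along(batches)) {
    # Wait before starting a new window, but never before the first call
    if (i > 1 && (i - 1) %% calls_per_window == 0) {
      message("Call budget used up; waiting ", window_secs, " seconds...")
      sleep_fn(window_secs)
    }
    results[[i]] <- lookup_fn(batches[[i]])
  }
  do.call(c, results)
}

# Dry run with a stand-in lookup and a no-op sleep (assumptions for
# demonstration only), so nothing is sent to Twitter and nothing blocks.
demo <- lookup_in_windows(1:3100,
                          lookup_fn = function(ids) as.list(ids),
                          sleep_fn = function(secs) invisible(NULL))
print(length(demo))  # 3100
```

Unlike the loop in the question, this only sleeps between windows, so there is no 15-minute wait after the final batch.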

Upvotes: 1
