wrahool

Reputation: 1141

Getting more than the number of friends allowed by Twitter API using rtweet

I have written the following script that fetches friends of Twitter users ("barackobama" in this example) in batches of 75,000 (5000 friends per API call x 15 API calls) every 15 minutes using rtweet. However, after the script is done running, I find that the friend ids repeat after a fixed interval. For instance, rows 1, 280001, and 560001 have the same ID. Rows 2, 280002, and 560002 have the same ID, and so on. I'm wondering if I'm understanding next_cursor in the API incorrectly.

library(rtweet)

u = "barackobama"
n_friends = lookup_users(u)$friends_count
curr_page = -1
fetched_friends = 0
i = 0
all_friends = NULL

while(fetched_friends < n_friends)  {

  if(rate_limit("get_friends")$remaining == 0) {
    print(paste0("API limit reached. Resetting at ", rate_limit("get_friends")$reset_at))
    Sys.sleep(as.numeric((rate_limit("get_friends")$reset + 0.1) * 60))
  }

  curr_friends = get_friends(u, n = 5000, retryonratelimit = TRUE, page = curr_page)
  i = i + 1
  all_friends = rbind(all_friends, curr_friends)
  fetched_friends = nrow(all_friends)
  print(paste0(i, ". ", fetched_friends, " out of ", n_friends, " fetched."))
  curr_page = next_cursor(curr_friends)
}

Any help will be appreciated.

Upvotes: 0

Views: 150

Answers (1)

Terence Eden

Reputation: 14334

You are not doing anything wrong. From the documentation:

this ordering is subject to unannounced change and eventual consistency issues

For very large lists, the API simply won't return all the information you want.
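If the list starts repeating, as in the question, one defensive option is to keep only unseen IDs and stop as soon as a page adds nothing new. The following is a minimal sketch of that idea, not rtweet's own behaviour: `collect_ids` and `fetch_page` are hypothetical names, and `fake_fetch` is a stand-in that simulates a 12-item list wrapping around so the example runs offline. In real use, `fetch_page` would presumably wrap `get_friends(u, n = 5000, page = cursor)` and `next_cursor()` from the question.

```r
# Hypothetical helper: page through a cursor-based endpoint, keeping only
# unseen IDs. Stops when the cursor is "0" (end of list), when a page
# contributes no new IDs (the list has wrapped around), or after max_pages.
# fetch_page is an assumed function(cursor) returning
#   list(ids = <character vector>, next_cursor = <character>).
collect_ids <- function(fetch_page, max_pages = 1000) {
  seen <- character(0)
  cursor <- "-1"                       # "-1" requests the first page
  for (page in seq_len(max_pages)) {
    res <- fetch_page(cursor)
    new_ids <- setdiff(res$ids, seen)  # IDs not collected so far
    if (length(new_ids) == 0) break    # whole page is repeats: stop
    seen <- c(seen, new_ids)
    cursor <- res$next_cursor
    if (identical(cursor, "0")) break  # API signalled the last page
  }
  seen
}

# Offline stand-in for the API: 12 IDs served 5 per page, wrapping around
# forever and never returning the terminating cursor "0".
fake_ids <- as.character(1001:1012)
fake_fetch <- function(cursor) {
  start <- if (identical(cursor, "-1")) 0L else as.integer(sub("^c", "", cursor))
  idx <- (start + 0:4) %% length(fake_ids) + 1
  list(ids = fake_ids[idx],
       next_cursor = paste0("c", (start + 5L) %% length(fake_ids)))
}

result <- collect_ids(fake_fetch)
```

This only guards against collecting duplicates and looping indefinitely; it cannot recover IDs the API never returns.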

Upvotes: 1
