Reputation: 95
So, I have a script that uses curl_download to download a Twitter page and then uses read_html to get some data off of it. It used to work fine, but now, instead of downloading the proper Twitter page, it downloads this page:
I'm not sure how curl would end up with the wrong browser, or how to change it if it does, but this is a very new problem. The reason I'm doing this is so the script can grab the number of followers from the .html file (and do a bunch of other irrelevant things with it), so if anyone happens to know a significantly easier way to do that, I'm open to it. Otherwise, I'm hoping someone has seen this curl issue.
Here is my code:
library(curl)
twitter_file <- "location the file is meant to be saved"
curl_download("https://twitter.com/SelectFulton", twitter_file, quiet = TRUE)
Thank you!
Upvotes: 0
Views: 431
Reputation: 95
@r2evans was correct about changing the user agent working! This was the code I ended up using:
withr::with_options(
  list(HTTPUserAgent = "Googlebot/2.1 (+http://www.google.com/bot.html)"),
  curl_download("https://twitter.com/SelectFulton", twitter_file, quiet = TRUE)
)
and there are no longer any issues. Thanks for the help!
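For anyone who would rather not depend on withr, I believe the same thing can be done by giving curl_download its own handle with the user agent set on it. This is only a sketch: download_with_ua is a hypothetical helper name I made up, and I'm assuming the useragent handle option (libcurl's CURLOPT_USERAGENT) has the same effect as the HTTPUserAgent option did for me.

```r
library(curl)

# Hypothetical helper (not part of the curl package): download a page
# while sending an explicit User-Agent header.
download_with_ua <- function(url, destfile, user_agent) {
  # new_handle() forwards named arguments to handle_setopt();
  # "useragent" corresponds to libcurl's CURLOPT_USERAGENT.
  h <- new_handle(useragent = user_agent)
  curl_download(url, destfile, quiet = TRUE, handle = h)
}

# Example call, using the same placeholder path and UA string as above:
# download_with_ua("https://twitter.com/SelectFulton", twitter_file,
#                  "Googlebot/2.1 (+http://www.google.com/bot.html)")
```

The advantage, as far as I can tell, is that the user agent only applies to this one handle rather than to a global option, so nothing else in the session is affected.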
Upvotes: 1