I'm trying to pull the full text of the URLs in a Twitter feed (about 3,000 of them) via the twitteR package in R. Specifically, I want the longitude and latitude data contained in the URLs in tweets like this one: https://twitter.com/PGANVACentralCh/status/885702041275969536
However, the twitteR package returns the shortened form of each URL rather than its destination, e.g.: https://t dot co slash Y0pGeSiVFJ
I could follow all 3,000 links individually, copy and paste their URLs, and then transform them to longitude and latitude, but surely there is a simpler way?
Not that it matters for this particular problem, but I am getting the tweets via this code:
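library(twitteR)
library(httr)
library(dplyr)  # as_tibble()
library(purrr)  # map_df()
# Fetch up to 3,200 tweets from the account's timeline
poketweets <- userTimeline("PGANVACentralCh", n = 3200)
# Convert each status object to a data frame and bind the rows together
poketweets_df <- as_tibble(map_df(poketweets, as.data.frame))
write.csv(poketweets_df, "poketweets.csv")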
Upvotes: 0
Views: 137
You need to get hold of the expanded_url value for each entry under entities.urls in the Tweet object. I do not believe that the status objects returned by twitteR support that (the status object fields are only a subset of the Tweet JSON values). Additionally, twitteR is now deprecated in favour of rtweet.
Using rtweet, you can modify your code:
library(rtweet)
# Returns a data frame with one row per tweet
poketweets <- get_timeline("PGANVACentralCh", n = 50)
head(poketweets)
You'll find there's a urls_expanded field in the returned data frame that you can use (the exact column name varies between rtweet versions, so check names(poketweets)).
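Once you have the expanded URLs, something along these lines might get you to coordinates. This is only a rough sketch: I'm assuming each URL carries a decimal "lat,lon" pair somewhere in it, and the example URLs, `extract_lat_lon()` and `expand_url()` are names made up for illustration, not anything from twitteR or rtweet. The `expand_url()` helper is a fallback for when you only have the short links; it relies on httr following redirects and needs network access, so it is defined here but not called.

```r
# Sketch: pull the first decimal "lat,lon" pair out of each URL (base R only).
# ASSUMPTION: coordinates appear as e.g. "37.5407,-77.4360" in the URL.
extract_lat_lon <- function(urls) {
  pattern <- "(-?[0-9]+\\.[0-9]+),(-?[0-9]+\\.[0-9]+)"
  hits <- regmatches(urls, regexec(pattern, urls))
  # regexec gives character(0) for URLs without a match; map those to NA
  coords <- t(vapply(hits, function(h) {
    if (length(h) == 3) as.numeric(h[2:3]) else c(NA_real_, NA_real_)
  }, numeric(2)))
  data.frame(url = urls, lat = coords[, 1], lon = coords[, 2],
             stringsAsFactors = FALSE)
}

# Fallback sketch: resolve a shortened link by following its redirects.
# ASSUMPTION: the shortener answers HEAD requests with a redirect.
expand_url <- function(short_url) {
  resp <- httr::HEAD(short_url)
  resp$url  # final URL after redirects
}

# Hypothetical URLs, for illustration only
example_urls <- c(
  "https://maps.example.com/?q=37.5407,-77.4360",
  "https://example.com/no-coordinates"
)
coords_df <- extract_lat_lon(example_urls)
print(coords_df)
```

With real data you would pass the expanded-URL column of the rtweet data frame to `extract_lat_lon()` instead of `example_urls`, and adjust the pattern to whatever the links actually look like.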
Upvotes: 1