Nazer

Reputation: 3764

Scrape images from tweets using R

I would love to make a Twitter/blogdown blog of the images that someone posts, but I'm not sure it is even possible. I used 'twitteR' to scrape all the posts from one person, but it looks like I would have to do something completely different to get images instead of text.

Any advice on what direction to take would be appreciated.

Upvotes: 2

Views: 1639

Answers (1)

neilfws

Reputation: 33782

Quite a broad question, but here are some ideas.

First: I recommend using the rtweet package. In my experience it makes authentication much easier and returns data in easy-to-use structures.

As an example, here's how I'd fetch my own last 100 tweets after setting up authentication as described in the package documentation:

library(rtweet)
library(dplyr)

neilfws <- get_timeline("neilfws", n = 100)
neilfws %>%
  glimpse()

The column media_id indicates whether a tweet has attached media; its value is NA if not. So a quick count of how many rows have media:

neilfws %>%
  filter(!is.na(media_id)) %>%
  nrow()

The link to the media is in the column media_url. So here are the URLs of the first 6 images associated with my tweets:

neilfws %>% 
  filter(!is.na(media_id)) %>% 
  select(media_url) %>% 
  head()

1 http://pbs.twimg.com/media/DHzGbvyVoAAm8in.jpg
2 http://pbs.twimg.com/media/DHfc4idV0AA6qyc.jpg
3 http://pbs.twimg.com/media/DHfNamEVYAA5H_U.jpg
4 http://pbs.twimg.com/media/DHYuG1oUwAADV-z.jpg
5 http://pbs.twimg.com/media/DHQlEQqUAAAHoCK.jpg
6 http://pbs.twimg.com/media/DHLG_ESUMAAMURj.jpg

Now that you have the media URLs, you can work on the code to retrieve or display them.
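
As a starting point for the retrieval step, here is a minimal sketch using only base R. The helper names media_paths and download_media and the "images" directory are my own placeholders, not part of rtweet. It calls unlist() on the input because, depending on the rtweet version, media_url may be a list column rather than a character column; that is an assumption worth checking against your own data.

```r
# Hypothetical helper: map each media URL to a local file path,
# using the last part of the URL as the file name.
media_paths <- function(urls, dest_dir = "images") {
  file.path(dest_dir, basename(urls))
}

# Hypothetical helper: download every non-missing, unique URL into dest_dir.
# Returns the paths of the files that downloaded successfully.
download_media <- function(urls, dest_dir = "images") {
  urls <- unique(unlist(urls))   # flatten in case media_url is a list column
  urls <- urls[!is.na(urls)]     # tweets without media have NA here
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- media_paths(urls, dest_dir)

  ok <- vapply(seq_along(urls), function(i) {
    tryCatch({
      # mode = "wb" keeps binary image files intact on Windows
      status <- download.file(urls[i], paths[i], mode = "wb", quiet = TRUE)
      status == 0
    }, error = function(e) FALSE)  # skip URLs that fail instead of stopping
  }, logical(1))

  paths[ok]
}
```

With the data frame from above, download_media(neilfws$media_url) should save the images into an images folder in the working directory and return their paths, which you could then reference from your blogdown posts.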

Upvotes: 6
