How can I collect tweets from within the last seven days using the rtweet package?

I have started using the rtweet package and, so far, I have had good results with my query, language, and geocode parameters. However, I still do not know how to collect Twitter data from within the last 7 days.

For example, in the next code chunk I want to extract data for 7 days, but I am not sure whether the collected tweets will run from 2017-06-29 to 2017-07-05 or from 2017-06-22 to 2017-06-29:

## Stream all tweets mentioning AMLO or lopezobrador for 7 days
stream_tweets("AMLO,lopezobrador",
              timeout = 60*60*24*7,
              file_name = "tweetsaboutAMLO.json",
              parse = FALSE)

## Read in the data as a tidy tbl data frame
AMLO <- parse_stream("tweetsaboutAMLO.json")

Do you know if there are any commands in rtweet to specify the time frame to use when using the search_tweets() or stream_tweets() functions?

Upvotes: 0

Views: 1879

Answers (2)

Nicolás Velasquez

Reputation: 5898

So, to answer your question about how to write it more efficiently, you could try a for loop or a list apply (lapply()). Here I show the for loop.

First, create a vector with the 4 dates you are querying.

fechas <- seq.Date(from = as.Date("2018-06-24"), to = as.Date("2018-06-27"), by =  1)

Then create an empty data.frame to store your tweets.

df_tweets <- data.frame()

Now, loop along your dates and populate the empty data.frame.

for (i in seq_along(fechas)) {
  ## `until` returns tweets created before the given date (YYYY-MM-DD)
  df_temp <- search_tweets("lang:es",
                           geocode = mexico_coord,
                           until = as.character(fechas[i]),
                           n = 100)
  df_tweets <- rbind(df_tweets, df_temp)
}
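
The "list apply" alternative mentioned at the start of this answer could look roughly like the sketch below. It is only an illustration, not rtweet's own API: collect_by_date() and fetch_until are hypothetical names, and fetch_until stands for any function that takes a "YYYY-MM-DD" string and returns a data frame, such as a thin wrapper around search_tweets().

```r
## Hypothetical helper (not part of rtweet): run one query per date and
## stack the results into a single data frame.
collect_by_date <- function(dates, fetch_until) {
  dates <- as.character(dates)          # Date -> "YYYY-MM-DD" strings
  pieces <- lapply(dates, fetch_until)  # one data frame per date
  do.call(rbind, pieces)                # append them row-wise
}

## Assumed usage with rtweet (needs a Twitter token and mexico_coord):
## fetch_es <- function(d) {
##   search_tweets("lang:es", geocode = mexico_coord, until = d, n = 100)
## }
## df_tweets <- collect_by_date(fechas, fetch_es)
```

Collecting the pieces in a list and binding them once avoids growing df_tweets on every iteration.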


On the other hand, the following solution might be more convenient and efficient altogether:

library(dplyr)

df_tweets2 <- search_tweets("lang:es",
                            geocode = mexico_coord,
                            until = "2018-06-29", ## or latest date
                            n = 10000)
df_tweets2 %>%
  group_by(as.Date(created_at)) %>%  ## Group (or set apart) the tweets by date of creation
  sample_n(100)   ## Obtain 100 random tweets for each group, in this case, for each date.
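
One caveat: sample_n(100) stops with an error when a date has fewer than 100 tweets. Below is a minimal base R sketch of the same per-day sampling that caps the sample at the group size instead. sample_per_day() is a hypothetical helper, and it only assumes a created_at date-time column like the one in rtweet's parsed output.

```r
## Hypothetical helper: up to `n_per_day` random rows for each date.
sample_per_day <- function(df, n_per_day = 100) {
  by_day <- split(df, as.Date(df$created_at))   # one data frame per date
  picked <- lapply(by_day, function(g) {
    k <- min(n_per_day, nrow(g))                # never ask for more rows than exist
    g[sample.int(nrow(g), k), , drop = FALSE]
  })
  out <- do.call(rbind, picked)
  rownames(out) <- NULL                         # drop the "date.row" row names
  out
}

## Assumed usage: df_sample <- sample_per_day(df_tweets2, 100)
```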

Upvotes: 1

I already found a way to collect tweets within the past seven days. However, it is not efficient.

rt_24 <- search_tweets("lang:es",
                       geocode = mexico_coord,
                       until = "2018-06-24",
                       n = 100)

rt_25 <- search_tweets("lang:es",
                       geocode = mexico_coord,
                       until = "2018-06-25",
                       n = 100)

rt_26 <- search_tweets("lang:es",
                       geocode = mexico_coord,
                       until = "2018-06-26",
                       n = 100)

rt_27 <- search_tweets("lang:es",
                       geocode = mexico_coord,
                       until = "2018-06-27",
                       n = 100)

Then, append the data frames.


Do you know if there is a more efficient way to write this? Maybe using the max_id() function in combination with until?

Upvotes: 0
