Stephen Younger

Reputation: 163

python twitter api most recent results

How can I get 1500 tweets? I tried the page parameter and found out that it doesn't work, so I'm now stuck with max_id and since_id, and I don't know how to use them. When I make a query, I would like to get the 1500 most recent tweets as of the moment the query is sent. Here is my code:

# -*- coding: utf-8 -*-
import urllib
import simplejson

def searchTweets(query):
    search = urllib.urlopen("http://search.twitter.com/search.json?q="+query)
    dict = simplejson.loads(search.read())
    counter = 0
    for result in dict["results"]:
        print "*", result["text"].encode('utf-8')
        counter += 1
    print "\n", counter, " tweets found", "\n"

searchTerm = "steak"
searchTweets(searchTerm+"&rpp=100&page=15")

Does anyone know a solution?

Upvotes: 1

Views: 533

Answers (1)

crunkchitis

Reputation: 738

I got this working for 1200 tweets:

# -*- coding: utf-8 -*-
import urllib
import simplejson

def searchTweets(query, minimum_tweets):
    results = []
    i = 0
    while len(results) < minimum_tweets:
        if i == 0:  # First time through, don't include max_id
            response = urllib.urlopen("http://search.twitter.com/search.json?q="+query+"&rpp=100")
        else:  # Subsequent times, include max_id
            response = urllib.urlopen("http://search.twitter.com/search.json?q="+query+"&rpp=100&max_id="+max_id)
        response = simplejson.loads(response.read())
        if not response['results']: break  # Stop if no tweets are returned
        max_id = str(long(response['results'][-1]['id_str'])-1)  # Define max_id for the next iteration
        results.extend(response['results'])  # Add this page of tweets to the results list
        i += 1

    print "\n", len(results), " tweets found", "\n"

searchTerm = "steak"
searchTweets(searchTerm, 1200)

The problem with it is that the Twitter search API breaks pretty often, and there's no error handling or retrying here. But it should show you the logic behind max_id. I set max_id to one less than the id of the last tweet that was pulled, so there aren't any repeats.
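If you want to guard against those intermittent failures, one option is to wrap each request in a small retry loop. This is just a rough sketch, assuming the same urllib/simplejson setup as above; the fetch_page helper, the retry count, and the sleep interval are made up for illustration, not part of the API.

# -*- coding: utf-8 -*-
import time
import urllib
import simplejson

# Hypothetical helper: retry a search request a few times before giving up
def fetch_page(url, retries=3, delay=5):
    for attempt in range(retries):
        try:
            response = urllib.urlopen(url)
            data = simplejson.loads(response.read())
            if 'results' in data:  # Sanity-check the payload before returning it
                return data
        except (IOError, ValueError):  # Network error or malformed JSON
            pass
        time.sleep(delay)  # Wait a bit before trying again
    return None  # Caller decides what to do when all retries fail

You could then call fetch_page instead of urllib.urlopen inside the while loop and break out if it returns None.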

Also, there are definitely more elegant ways to decide whether or not to include max_id in the URL. I did it this way because max_id doesn't have a default value (which I was hoping for :p).
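For example, one tidier option (not what I used above, just a sketch with a made-up build_search_url helper) is to build the query string from a dict with urllib.urlencode and only add the max_id key once you actually have one:

import urllib

# Hypothetical helper: assemble the search URL, optionally with max_id
def build_search_url(query, max_id=None):
    params = {'q': query, 'rpp': 100}  # Parameters every request needs
    if max_id is not None:  # Only include max_id after the first page
        params['max_id'] = max_id
    return "http://search.twitter.com/search.json?" + urllib.urlencode(params)

That way the loop body makes a single urlopen call and you avoid the if/else on the URL string.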

Upvotes: 1
