Andrea Angeli

Reputation: 141

Tweepy Streaming filter fields

I have this Python code that retrieves data from Twitter with Tweepy and the Streaming API, and it stops once it has found 1000 results (that is, the data for 1000 tweets). It works well, but when I run it in PyCharm, part of the results gets cut off. Since the code returns all the data for each tweet (ID, text, author, etc.), it probably generates too much data and the software crashes. So I'd like to modify the code to get only some fields of the tweet data (e.g. I need only the text of the tweet, the author, and the date). Any suggestion is appreciated.

# Import the necessary package to process data in JSON format
try:
    import json
except ImportError:
    import simplejson as json

# Import the necessary methods from "twitter" library
from twitter import Twitter, OAuth, TwitterHTTPError, TwitterStream

# Variables that contains the user credentials to access Twitter API
ACCESS_TOKEN = ''
ACCESS_SECRET = ''
CONSUMER_KEY = ''
CONSUMER_SECRET = ''


oauth = OAuth(ACCESS_TOKEN, ACCESS_SECRET, CONSUMER_KEY, CONSUMER_SECRET)

# Initiate the connection to Twitter Streaming API
twitter_stream = TwitterStream(auth=oauth)

# Get a sample of the public data flowing through Twitter
#iterator = twitter_stream.statuses.sample()  # plain Twitter streaming sample

# Twitter streaming filtered by a search track and by language.
# For other search parameters see https://dev.twitter.com/streaming/overview/request-parameters
iterator = twitter_stream.statuses.filter(track="Euro2016", language="en")


# Print each tweet in the stream to the screen
# Here we set it to stop after getting 1000 tweets.
# You don't have to set it to stop, but can continue running
# the Twitter API to collect data for days or even longer.
tweet_count = 1000  # number of results to return
for tweet in iterator:
    tweet_count -= 1
    # Twitter Python Tool wraps the data returned by Twitter
    # as a TwitterDictResponse object.
    # We convert it back to the JSON format to print/store
    print(json.dumps(tweet))

    # The command below will do pretty printing for JSON data, try it out
    # print(json.dumps(tweet, indent=4))

    if tweet_count <= 0:
        break

Upvotes: 1

Views: 1971

Answers (1)

Siva Umapathy

Reputation: 177

I was able to run this in PyCharm without any issues for 1000 tweets, so try running it on another computer, or investigate whether something is wrong with your current setup.

Each result is a Python dictionary, so you can access individual elements by key, like this:

for tweet in iterator:
    tweet_count -= 1
    # Access individual elements such as 'text', 'created_at', ...
    print(tweet['text'])
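
To keep only the text, the author and the date, the same dictionary access can be wrapped in a small helper. The following is only a sketch under a few assumptions: `extract_fields` is a hypothetical name, the author is taken from the `screen_name` key of the nested `user` object in Twitter's tweet JSON, and `sample_stream` is made-up data standing in for the real `iterator` so that the snippet runs without credentials. A stream can also deliver messages that are not tweets (for example delete notices), which have no `text` key, so the helper skips those instead of raising a `KeyError`.

```python
import json


def extract_fields(tweet):
    """Return only text, author and date, or None if the message is not a tweet.

    Hypothetical helper, not part of any Twitter library.
    """
    if "text" not in tweet:
        return None
    return {
        "text": tweet["text"],
        "author": tweet.get("user", {}).get("screen_name"),
        "created_at": tweet.get("created_at"),
    }


# Made-up sample data standing in for the stream iterator (assumption).
sample_stream = [
    {
        "id": 1,
        "created_at": "Fri Jun 10 19:00:00 +0000 2016",
        "text": "Kick-off! #Euro2016",
        "user": {"screen_name": "example_user"},
    },
    {"delete": {"status": {"id": 2}}},  # non-tweet message, no 'text' key
]

results = []
for tweet in sample_stream:
    fields = extract_fields(tweet)
    if fields is None:
        continue  # skip anything that is not a tweet
    results.append(fields)
    print(json.dumps(fields))
```

With the real stream, the loop would run over `iterator` instead of `sample_stream`, and `tweet_count` would be decremented only for messages that actually produced a result.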

Upvotes: 2
