Filtering Twitter data using Tweepy

Question

I've used Marco Bonzanini's tutorial on mining Twitter data (https://marcobonzanini.com/2015/03/02/mining-twitter-data-with-python-part-1/) to set up this listener:

class MyListener(StreamListener):

    def on_data(self, data):
        try:
            with open('python.json', 'a') as f:
                f.write(data)
                return True
        except BaseException as e:
            print("Error on_data: %s" % str(e))
        return True

    def on_error(self, status):
        print(status)
        return True

I then used the "follow" parameter of the filter method to retrieve the tweets produced by a specific ID:

twitter_stream = Stream(auth, MyListener())
twitter_stream.filter(follow=["63728193"])  # random Twitter ID

However, this does not do what I want: it returns not only the tweets and retweets created by the ID, but also every tweet in which the ID is mentioned (such as other users' retweets of its tweets).

I'm sure there must be a way to do this, since the JSON returned by Twitter includes a "screen_name" field that gives the name of the tweet's creator. I just need to find out how to filter the data on this screen_name field.

asongtoruin · Accepted Answer

This behaviour is by design. To quote the Twitter streaming API docs:

For each user specified, the stream will contain:

Tweets created by the user.

Tweets which are retweeted by the user.

Replies to any Tweet created by the user.

Retweets of any Tweet created by the user.

Manual replies, created without pressing a reply button (e.g. “@twitterapi I agree”).

The best way for you to process it for your purposes is to check who created the tweet as it is received, which I believe can be done as follows:

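import json

class MyListener(StreamListener):
    def on_data(self, data):
        try:
            # data is the raw JSON string, so parse it before inspecting it
            tweet = json.loads(data)
            # keep only tweets whose author is the followed user;
            # id_str is the string form of the numeric user id
            if tweet.get('user', {}).get('id_str') == "63728193":
                with open('python.json', 'a') as f:
                    f.write(data)
        except BaseException as e:
            print("Error on_data: %s" % str(e))
        return True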

    def on_error(self, status):
        print(status)
        return True
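
To try the author check without opening a stream, here is a minimal sketch that applies it to raw JSON strings of the kind on_data receives. The helper name is_authored_by, the constant FOLLOWED_ID and the three sample payloads are hypothetical and heavily trimmed. The sketch assumes each tweet payload carries a "user" object with an "id_str" field, and that non-tweet messages (such as delete notices) do not.

```python
import json

FOLLOWED_ID = "63728193"  # the example ID used in the question


def is_authored_by(raw_tweet, user_id):
    """Return True if the raw JSON payload is a tweet posted by user_id."""
    try:
        tweet = json.loads(raw_tweet)
    except (ValueError, TypeError):
        # Not valid JSON (or not a string at all): treat as "not a match"
        return False
    if not isinstance(tweet, dict):
        return False
    user = tweet.get("user")
    if not isinstance(user, dict):
        # Messages without a user object (e.g. delete notices) are skipped
        return False
    # Compare as strings so an int or a str user_id both work
    return user.get("id_str") == str(user_id)


# Hypothetical, trimmed payloads for illustration only
own_tweet = json.dumps(
    {"text": "hello", "user": {"id_str": "63728193", "screen_name": "someone"}}
)
retweet_by_other = json.dumps(
    {"text": "RT @someone: hello", "user": {"id_str": "999", "screen_name": "other"}}
)
delete_notice = json.dumps({"delete": {"status": {"id_str": "1"}}})

# Only the payload written by the followed account survives the filter
kept = [
    payload
    for payload in (own_tweet, retweet_by_other, delete_notice)
    if is_authored_by(payload, FOLLOWED_ID)
]
```

Inside a listener, the same check would decide whether a payload is written to the file. Matching on the numeric ID rather than on screen_name is a design choice here: a screen name can be changed by its owner, while the ID stays the same.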
