Reputation: 43
I am using the Tweepy interface for the Twitter API to aggregate tweets matching a particular query term, for real-time sentiment analysis. My objective is to search the tweets for each hour of each day over the past 7 days for the given query term and analyze how sentiment has varied over time. Each search request returns 100 tweets.
As I understand it, the Twitter API provides since and until attributes for the search query: you enter two dates, and tweets are fetched from within that date range. However, they don't seem to work with finer time periods (like hours or minutes). Is there any way to search at that granularity?
Bonus question: During a search, 75% of the tweets fetched are retweets of the same tweet. I have to remove the duplicates after fetching them all by checking the retweeted_status attribute of each tweet. Is there any provision in the API that removes retweets on the server side before they are fetched, so that I get more relevant data?
Upvotes: 2
Views: 532
Reputation: 101
To the bonus question: yes, you can filter out retweets at the API level, as described in the Twitter API documentation:
https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/standard-operators
Simply add the -filter:retweets operator to your query before you pass it to the Cursor.
query="search_this -filter:retweets"
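For context, here is a minimal sketch of how that query string might be passed to the Cursor. The helper names build_query and fetch_original_tweets are hypothetical, and the search method is passed in as an argument because its name depends on your Tweepy version (api.search in 3.x, api.search_tweets in 4.x). It assumes you already have an authenticated tweepy.API object.

```python
def build_query(term):
    """Append the standard-search operator that excludes retweets."""
    return f"{term.strip()} -filter:retweets"


def fetch_original_tweets(search_method, term, limit=100):
    """Page through search results with retweets excluded server-side.

    search_method is an assumption about your setup: pass api.search
    (Tweepy 3.x) or api.search_tweets (Tweepy 4.x), where api is an
    authenticated tweepy.API instance.
    """
    # Imported lazily so build_query can be used without Tweepy installed.
    import tweepy

    cursor = tweepy.Cursor(search_method, q=build_query(term), count=100)
    return list(cursor.items(limit))


# Hypothetical usage, given an authenticated `api` object:
# tweets = fetch_original_tweets(api.search_tweets, "search_this", limit=500)
```

Since the operator is applied by the search endpoint itself, the client-side retweeted_status check should no longer be needed for these results.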
Relevant StackOverflow question: Tweepy - Exclude Retweets
Upvotes: 0