Preventing a Python Twitter bot from ever posting duplicate status updates

Question

I’m just dipping my toes into Python right now, and I learn best (albeit inefficiently) with a project. My current project is a Twitter bot that scrapes a government website for the latest COVID-19 case counts in my jurisdiction and tweets them out, building off this awesome tutorial.

Functionally it is working, but I want to finesse it so that it only posts when that data is updated and new. Otherwise, it’s just an account that posts the same information every day rather than a news account.

I thought the Twitter API’s built-in rules against duplicate tweets would automatically filter out old information. Sometimes they do, but the rule isn’t strict enough: it appears the account can still post duplicates as long as it doesn’t do so too often. Ideally, I’d like to enforce a stricter check in my own code. It would need to compare the new text to the last tweet and only tweet if the text differs.

Can anyone give me some guidance on whether this is possible, and how best to get it done? I’m at a stage in my coding where I’m not sure what terms to search for to find a solution.

Here’s the current code as it stands:

import sys
from config import CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET
import tweepy

import requests
from lxml import html
from threading import Timer

def create_tweet():
    response = requests.get('https://yukon.ca/en/case-counts-covid-19')
    doc = html.fromstring(response.content)
    A, B, C, D, E, F = doc.xpath('//table[@class="table"]//td[2]//text()')

    tweet = f'''Yukon COVID-19 cases count
Total people tested: {A}
Confirmed cases: {B}
Recovered cases: {C}
Deaths: {D}
Negative results: {E}
Pending results: {F}

Data from: https://yukon.ca/en/case-counts-covid-19
'''
    return tweet


if __name__ == '__main__':
    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

    # Create API object
    api = tweepy.API(auth)

    try:
        api.verify_credentials()
        print('Authentication Successful')
    except Exception:
        print('Error while authenticating API')
        sys.exit(1)

    tweet = create_tweet()
    api.update_status(tweet)
    print('Tweet successful')
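
One possible way to get the stricter check described above is to remember the text of the last post locally and skip posting when the freshly scraped text is identical. The following is only a minimal sketch under some assumptions: the file name last_tweet.txt and the helpers is_new, remember, and post_if_new are hypothetical names made up for illustration, and the script is assumed to run somewhere the file survives between runs.

```python
from pathlib import Path

# Hypothetical file name (an assumption); any writable, persistent location works.
STATE_FILE = Path('last_tweet.txt')


def is_new(tweet, state_file=STATE_FILE):
    """Return True if `tweet` differs from the text saved after the last post."""
    if not state_file.exists():
        return True  # nothing has been recorded yet, so treat the text as new
    saved = state_file.read_text(encoding='utf-8')
    # Ignore leading/trailing whitespace so a stray newline doesn't count as a change.
    return saved.strip() != tweet.strip()


def remember(tweet, state_file=STATE_FILE):
    """Save the text that was just posted so the next run can compare against it."""
    state_file.write_text(tweet, encoding='utf-8')


def post_if_new(tweet, post, state_file=STATE_FILE):
    """Call `post(tweet)` only when the text changed; return True if it posted."""
    if not is_new(tweet, state_file):
        return False
    post(tweet)
    # Record the text only after posting succeeded, so a failed post is retried.
    remember(tweet, state_file)
    return True
```

With this in place, the last lines of the main block could become something like `if post_if_new(create_tweet(), api.update_status): print('Tweet successful')`. Comparing against a local copy rather than the account's timeline is a deliberate choice in this sketch: Twitter rewrites links as t.co URLs, so the text read back from a posted tweet may not match the original string exactly.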
