Reputation: 861
So I'm working with Python and the Twitter API, using Tweepy and Twitter's Stream API, which returns Tweet objects in real time. Part of my app, which queries a different API, doesn't play nice with URLs in the tweet text, so I'm using the Python re module to replace them with a harmless identifier string. However, I'm having trouble finding the URLs that need to be parsed out of the text. Instead of searching through the text myself for URLs, I decided to use the ones that the API delivers and do a "find and replace" in the text.
Here is the documentation on what the API gives me. It gives a t.co URL, a display URL, and a fully expanded URL. The problem with just using the t.co URL is that Twitter doesn't automatically convert all URLs in tweets to t.co, only ones past a certain length. This means that the t.co URL isn't always the same one that appears in the tweet text.
So I need to figure out how to get, from the API, the version of the URL which actually appears in the text of the tweet.
Thanks! evamvid
Upvotes: 0
Views: 2105
Reputation: 15953
Try using this for the expanded_url:
# Replace `tweet` with whatever your loop/function extracts from the JSON.
tweet_url = str(tweet.expanded_url)  # you might not need str(); test it yourself if you'd like
tweet_url = tweet_url.replace('\\', '')  # strip any backslashes from the URL
print(tweet_url)
That should give you the link without the backslashes, the way you want it.
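If the real goal is to blank out whatever URL actually appears in the tweet text, another option is to skip string matching and use the character offsets that come with each URL entity. Below is a minimal sketch, assuming each entry in the tweet's entities "urls" list is a dict carrying an "indices" pair (start, end) alongside "url", "expanded_url" and "display_url", as described in Twitter's entities documentation. With Tweepy that list is usually reachable as status.entities['urls'], but check that against your version. The function name replace_urls, the placeholder string, and the sample data are made up for illustration.

```python
def replace_urls(text, url_entities, placeholder="URL"):
    """Replace each URL span in `text` with `placeholder`, using entity offsets."""
    # Go from the last entity to the first so that earlier offsets
    # are still valid after each replacement changes the string length.
    ordered = sorted(url_entities, key=lambda e: e["indices"][0], reverse=True)
    for entity in ordered:
        start, end = entity["indices"]
        text = text[:start] + placeholder + text[end:]
    return text


# Hypothetical sample data shaped like the "urls" entities of a tweet.
sample_text = "Check this out http://t.co/abc123 and also http://t.co/xyz789 !"
sample_entities = []
for link in ("http://t.co/abc123", "http://t.co/xyz789"):
    start = sample_text.index(link)
    sample_entities.append({"url": link, "indices": [start, start + len(link)]})

print(replace_urls(sample_text, sample_entities))
```

Since the offsets point at whatever is literally in the text, it doesn't matter whether the t.co, display, or expanded form ended up there. One caveat: if the text has been modified (e.g. HTML-unescaped) before this runs, the offsets may no longer line up.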
Upvotes: 1