Reputation: 23
I want to crawl data from Twitter. I'm using the Twitter API, but because of its rate limit the crawl runs very slowly. As an alternative, I could bypass the API and parse the pages directly, e.g. with the urllib package, but that is all I know.
Could you give me some help on how to crawl timeline and following data from Twitter without using the Twitter API? Do you have any suggestions? Thanks in advance.
PS: I'm using Python for programming.
Upvotes: 2
Views: 1613
Reputation: 38
You can use BeautifulSoup for this: fetch the page with urllib, then parse the HTML and pull out the tweet text.
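from bs4 import BeautifulSoup
from urllib.request import urlopen

# YOUR_TWITTER_URL is a placeholder for the profile page you want to scrape
html = urlopen(YOUR_TWITTER_URL).read()
soup = BeautifulSoup(html, 'html.parser')
# Each tweet is an <li> inside the <ol class="stream-items"> list
for tweet in soup.find('ol', attrs={'class': 'stream-items'}).find_all('li'):
    print(tweet.find('p').text)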
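If installing BeautifulSoup is not an option, roughly the same extraction can be sketched with only the standard library's html.parser. This is a minimal sketch, not a drop-in replacement: TweetTextParser, extract_tweets and SAMPLE_HTML are hypothetical names made up for illustration, the markup (an ol with class stream-items containing p elements) is assumed from the snippet above and may not match what the live site serves, and unlike the loop above it collects every p inside the list rather than only the first one per li.

```python
from html.parser import HTMLParser


class TweetTextParser(HTMLParser):
    """Collects the text of <p> elements found inside <ol class="stream-items">."""

    def __init__(self):
        super().__init__()
        self.in_stream = False  # True while inside the stream-items list
        self.in_p = False       # True while inside a <p> within that list
        self.tweets = []

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or valueless, hence the "or".
        classes = (dict(attrs).get("class") or "").split()
        if tag == "ol" and "stream-items" in classes:
            self.in_stream = True
        elif tag == "p" and self.in_stream:
            self.in_p = True
            self.tweets.append("")

    def handle_endtag(self, tag):
        # Simplification: any </ol> ends the list, so nested lists are not handled.
        if tag == "ol":
            self.in_stream = False
        elif tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.tweets[-1] += data


def extract_tweets(html):
    """Return the stripped text of each <p> inside the stream-items list."""
    parser = TweetTextParser()
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.tweets]


# Hand-written sample markup (an assumption, not real Twitter output).
SAMPLE_HTML = """
<ol class="stream-items">
  <li><p>first tweet</p></li>
  <li><p>second <b>tweet</b></p></li>
</ol>
<p>footer text</p>
"""

print(extract_tweets(SAMPLE_HTML))
```

To run it against a real page, pass the decoded HTML string you downloaded (for example the result of read().decode('utf-8')) to extract_tweets instead of SAMPLE_HTML.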
Upvotes: 1