tonia

Reputation: 23

Crawling Twitter using urllib instead of the Twitter API

I want to crawl data from Twitter. I'm using the Twitter API, but because of its rate limit it runs very slowly. Alternatively, I could bypass the Twitter API by fetching and parsing the pages directly, e.g. with the urllib package. But that is all I know.

Could you give me some pointers on how to crawl timeline and following data from Twitter without using the Twitter API? Any suggestions are welcome. Thanks in advance.

PS: I'm using Python for programming.

Upvotes: 2

Views: 1613

Answers (1)

cemcnaughton

Reputation: 38

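One option is to fetch the page with urllib and parse the HTML with BeautifulSoup. The example looks for tweets inside the 'stream-items' list, so it depends on Twitter's page markup: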

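# Python 3 version; requires the beautifulsoup4 package.
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen(YOUR_TWITTER_URL).read()  # YOUR_TWITTER_URL: the page to scrape
soup = BeautifulSoup(html, 'html.parser')
# Print the first <p> of each <li> in the <ol class="stream-items"> list.
for tweet in soup.find('ol', attrs={'class': 'stream-items'}).find_all('li'):
    text = tweet.find('p')
    if text is not None:
        print(text.get_text())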
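
To check the selector logic without hitting Twitter, here is a rough offline sketch of the same idea. It swaps BeautifulSoup for the standard library's html.parser so it runs with no extra packages. SAMPLE_HTML, TweetTextParser and extract_tweets are hypothetical names made up for this illustration, and the sample markup is only an assumption about what the page might look like. Unlike the snippet above, it collects every <p> inside the list rather than only the first one per <li>.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; real Twitter pages may be structured differently.
SAMPLE_HTML = """
<ol class="stream-items">
  <li><p>first tweet</p></li>
  <li><div>no paragraph here</div></li>
  <li><p>second <b>tweet</b></p></li>
</ol>
<p>outside the stream</p>
"""


class TweetTextParser(HTMLParser):
    """Collect the text of <p> elements inside <ol class="stream-items">."""

    def __init__(self):
        super().__init__()
        self.ol_depth = 0   # <ol> nesting depth inside the stream list; 0 = outside
        self.in_p = False   # True while inside a <p> that belongs to the list
        self.tweets = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "ol" and (self.ol_depth or "stream-items" in classes):
            self.ol_depth += 1
        elif tag == "p" and self.ol_depth and not self.in_p:
            self.in_p = True
            self.tweets.append("")

    def handle_endtag(self, tag):
        if tag == "ol" and self.ol_depth:
            self.ol_depth -= 1
        elif tag == "p":
            self.in_p = False

    def handle_data(self, data):
        # Text nested in inline tags such as <b> is appended to the open <p>.
        if self.in_p:
            self.tweets[-1] += data


def extract_tweets(html):
    """Return the stripped, non-empty paragraph texts found in the stream list."""
    parser = TweetTextParser()
    parser.feed(html)
    parser.close()
    return [t.strip() for t in parser.tweets if t.strip()]


print(extract_tweets(SAMPLE_HTML))
```

List items without a <p> are simply skipped, and paragraphs outside the list are ignored. If Twitter changes its markup or renders tweets with JavaScript, this approach (like the BeautifulSoup one) would likely return nothing.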

Upvotes: 1
