user6100022


Scraping Twitter data using BeautifulSoup

I am trying to scrape Twitter data using BeautifulSoup and the requests library. My approach is to log in first and then scrape the required page, but it is not working and I cannot find my mistake.

Here is my code:

import requests
from bs4 import BeautifulSoup

session_rqst = requests.session()
url = "https://twitter.com/login"
r = requests.get(url)
c = r.content
soup = BeautifulSoup(c, "html.parser")
token = soup.find("input", {"name": "authenticity_token"})
payload = {"username": "test_user", "password": "test_password"}
result = session_rqst.post(url, data=payload, headers=dict(referer="https://twitter.com/"))
all = result.content
soup1 = BeautifulSoup(all, "html.parser")
page = requests.get("https://twitter.com/akhiltaker/following")
page.content
soup1 = BeautifulSoup(page.content, "html.parser")

How can I scrape the followers list from the webpage?

Upvotes: 1

Views: 5060

Answers (1)

bastelflp

Reputation: 10096

Instead of scraping Twitter manually via requests and BeautifulSoup, use the Twitter API.

You can get the followers of an account directly; see the docs here: api-reference/get-followers-list. The endpoint returns the data as JSON.

There are various Python libraries for Twitter that you can use for this purpose.
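
To illustrate the API route, here is a minimal sketch of paging through the followers endpoint. It assumes the v1.1 `followers/list.json` endpoint is available to your account, and that its response carries a `users` list and a `next_cursor` value that becomes 0 on the last page. `fetch_json` is a hypothetical callable that you supply: it should perform an authenticated GET (for example through one of the Python Twitter libraries or an OAuth-enabled requests session) and return the decoded JSON as a dict. Authentication itself is not shown.

```python
from urllib.parse import urlencode

# Assumed endpoint: GET followers/list in the v1.1 REST API.
FOLLOWERS_URL = "https://api.twitter.com/1.1/followers/list.json"


def build_followers_url(screen_name, cursor=-1, count=200):
    """Return the request URL for one page of followers."""
    query = urlencode({"screen_name": screen_name, "cursor": cursor, "count": count})
    return f"{FOLLOWERS_URL}?{query}"


def iter_followers(screen_name, fetch_json):
    """Yield follower screen names, following next_cursor until it is 0.

    fetch_json is a hypothetical callable (not part of any library):
    it takes a URL, performs an authenticated GET, and returns a dict.
    """
    cursor = -1  # -1 requests the first page
    while cursor != 0:
        page = fetch_json(build_followers_url(screen_name, cursor))
        for user in page.get("users", []):
            yield user["screen_name"]
        # A missing next_cursor is treated as the end of the list.
        cursor = page.get("next_cursor", 0)
```

With a working `fetch_json`, `list(iter_followers("akhiltaker", fetch_json))` would collect the screen names page by page. Keeping the HTTP call injectable also makes the paging logic easy to try out with canned responses before touching the real API.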


Edit: Regarding your BeautifulSoup question: it looks like you cannot log in to Twitter that simply, so your response probably contains only a login or error page rather than your follower list. Have a look at this answer on how to log in to Twitter via Python.
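
As a rough illustration of why the plain POST is probably not enough: login forms usually carry hidden fields (such as the `authenticity_token` that the question looks up but never sends) that must be posted back together with the credentials. The sketch below collects such hidden fields from a page using only the standard library. `HiddenInputParser` and `hidden_fields` are names made up for this example, and whether Twitter accepts a login built this way is not something this sketch can guarantee.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if attr.get("type") == "hidden" and attr.get("name"):
            # A hidden input without a value attribute is stored as "".
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of hidden form fields found in an HTML string."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields
```

The returned dict would be merged into the login payload. Note also that the question fetches the login page and the followers page with `requests.get` instead of `session_rqst.get`, so any cookies set during login are not shared with those requests; every request would need to go through the same session object.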

Upvotes: 0
