Bread

Reputation: 135

Extract Number of Followers from Twitter using BeautifulSoup

I am trying to automate obtaining the number of followers of different Twitter accounts from the page source.

I have the following code for one account:

from bs4 import BeautifulSoup
import requests
username='justinbieber'
url = 'https://www.twitter.com/'+username
r = requests.get(url)
soup = BeautifulSoup(r.content)
for tag in soup.findAll('a'):
    if tag.has_key('class'):
        if tag['class'] == 'ProfileNav-stat ProfileNav-stat--link u-borderUserColor u-textCenter js-tooltip js-nav u-textUserColor':
            if tag['href'] == '/justinbieber/followers':
                print tag.title
                break

I am not sure where I went wrong. I understand that the Twitter API can provide the number of followers, but I would like to try obtaining it this way as well. Any suggestions?

I've modified the code from here

Upvotes: 1

Views: 2464

Answers (1)

wpercy

Reputation: 10090

If I were you, I'd pass the class name as an argument to find() instead of looping over the results of find_all(), and I'd first look for the <li> element that contains the anchor you're looking for. It would look something like this:

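from bs4 import BeautifulSoup
import requests

username = 'justinbieber'
url = 'https://www.twitter.com/' + username
r = requests.get(url)
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(r.content, 'html.parser')

# Find the <li> that wraps the followers link, then read the anchor's title attribute
f = soup.find('li', class_="ProfileNav-item--followers")
title = f.find('a')['title']
print(title)
# 81,346,708 Followers

# Keep the leading number and drop the thousands separators
num_followers = int(title.split(' ')[0].replace(',', ''))
print(num_followers)
# 81346708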
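
The last step works for any title of the form shown above. Here is a minimal sketch of it as a reusable function, assuming the title always begins with the comma-grouped count; `parse_follower_count` is a hypothetical helper name, not something bs4 provides.

```python
def parse_follower_count(title):
    """Turn a title such as '81,346,708 Followers' into an int."""
    # The count is the first whitespace-separated token;
    # remove the thousands separators before converting.
    count_text = title.split()[0].replace(',', '')
    return int(count_text)


print(parse_follower_count('81,346,708 Followers'))
# 81346708
```

If the title has a different shape (for example an abbreviated count such as "81.3M"), int() raises a ValueError, so you would notice the change instead of silently getting a wrong number.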

PS: findAll() was renamed to find_all() in bs4.
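
As for why the loop in the question probably prints nothing useful: bs4 returns the class attribute as a list of names rather than one string, and tag.title looks for a child <title> tag instead of reading the title attribute. The sketch below shows both on a small static snippet. The markup is hypothetical, modelled on the class names and href mentioned in this thread, and is not the real page source.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, modelled on the class names and href used in this thread.
html = '''
<ul>
  <li class="ProfileNav-item ProfileNav-item--followers">
    <a class="ProfileNav-stat js-nav" href="/justinbieber/followers"
       title="81,346,708 Followers">Followers</a>
  </li>
</ul>
'''

soup = BeautifulSoup(html, 'html.parser')
anchor = soup.find('a', href='/justinbieber/followers')

# class is a multi-valued attribute, so bs4 returns a list of class names.
classes = anchor['class']
print(classes)
# ['ProfileNav-stat', 'js-nav']

# Comparing that list with one long string is therefore always False.
print(classes == 'ProfileNav-stat js-nav')
# False

# anchor.title searches for a child <title> tag; there is none here.
print(anchor.title)
# None

# class_ matches one class name, whatever else is set on the element.
item = soup.find('li', class_='ProfileNav-item--followers')
title = item.find('a')['title']
print(title)
# 81,346,708 Followers
```

Matching on a single class name is also less brittle than matching the full class string, since it keeps working if the other classes on the element change.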

Upvotes: 2
