401 Error when Webscraping LinkedIn with BeautifulSoup

Question

I am trying to use Python's BeautifulSoup library to extract HTML from my LinkedIn "Recently Added Connections" page. Specifically, I want the name of the most recent connection, which appears near the top of the page.

When I inspect the HTML for this section, the name is wrapped like this:


<span class="mn-connection-card__name t-16 t-black t-bold">
      Bob McBobface
</span>

However, the HTML I get back with BeautifulSoup does not contain that element. Instead, it includes this:

{"request":"/voyager/api/configuration","status":200,"body":"bpr-guid-3322365"}

{"status":401}

I've tried fiddling with the Requests library, but to no avail. I'm a beginner, so I'm hoping I don't need to spend a few weeks learning about OAuth or Selenium.

Here's my code:

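from bs4 import BeautifulSoup
import urllib.request

# Fetch the connections page (this request sends no login cookies)
url = "https://www.linkedin.com/mynetwork/invite-connect/connections/"
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, 'html.parser')
# print(soup)  # uncomment to inspect the raw response

# Look for the span that wraps each connection's name
content_list = soup.find_all('span', class_="mn-connection-card__name t-16 t-black t-bold")
print(content_list)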

Running this prints an empty list, [], whereas I expected to see the span containing "Bob McBobface".

When I run print(soup), I get only a short HTML blurb containing the 401 status shown above.
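
One way to confirm that the empty list comes from an unauthenticated response rather than from the selector is to look for the embedded status codes directly. The sketch below is only illustrative: `embedded_statuses` and `sample` are hypothetical names, it assumes each JSON object sits on its own line as in the output above, and it uses the standard-library `json` module instead of BeautifulSoup.

```python
import json


def embedded_statuses(response_text):
    """Collect the "status" values from JSON objects that sit on their own lines."""
    statuses = []
    for line in response_text.splitlines():
        line = line.strip()
        # Only consider lines that look like a complete JSON object.
        if not (line.startswith("{") and line.endswith("}")):
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # not valid JSON, so skip it
        if isinstance(payload, dict) and "status" in payload:
            statuses.append(payload["status"])
    return statuses


# The two JSON lines quoted in the question (assumed to be on separate lines).
sample = '''{"request":"/voyager/api/configuration","status":200,"body":"bpr-guid-3322365"}

{"status":401}'''

statuses = embedded_statuses(sample)
print(statuses)  # [200, 401]
print("unauthenticated" if 401 in statuses else "no 401 found")
```

If a 401 shows up here, the request was treated as not logged in, so the span with the connection name was never part of the returned HTML and no change to the `find_all` call would find it.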

Any advice?
