ss_0708

Reputation: 193

Scraping from javascript in HTML tags using beautifulsoup

I am trying to scrape the names listed under every letter (A to Z) and digit (0-9) on this website: http://www.smfederation.org.sg/membership/members-directory

But the names seem to be hidden behind links with href="javascript:void(0)".

Below is my code

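import requests
from bs4 import BeautifulSoup

url = "http://www.smfederation.org.sg/membership/members-directory"
# url is a single string; iterating over it would yield one character at a time,
# so request the page once instead of looping.
detail = requests.get(url)
soup = BeautifulSoup(detail.content, 'html.parser')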

I have no idea how to approach JavaScript within HTML. What should I add to the above code to fetch the names of all listings?

Upvotes: 0

Views: 50

Answers (1)

Selcuk

Reputation: 59184

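You are scraping the wrong URL. The page fills in the names with JavaScript after it loads, so they are not in the HTML that requests receives. Open the inspector of your browser, go to the Network tab, and you will see that the names are at http://smfederation.org.sg/account/getaccounts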

The response is in JSON format, so the .json() method of the response object returned by requests parses it into a Python dictionary:

>>> import requests
>>> accounts = requests.get("http://www.smfederation.org.sg/account/getaccounts").json()
>>> accounts["data"][0]["accountname"]
'OPTO-PRECISION PTE LTD'

You can also get all accounts using a for loop, such as:

for account in accounts["data"]:
    print(account["accountname"])
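Since the question asks for the names under each letter and digit, here is a minimal sketch of grouping the names by their first character. It assumes the response has the shape shown above (a "data" list of dictionaries with an "accountname" key). The function name group_by_initial, the "#" catch-all bucket, and the sample data are hypothetical and only there for illustration.

```python
from collections import defaultdict


def group_by_initial(accounts):
    """Group account names by their first character (A-Z or 0-9)."""
    groups = defaultdict(list)
    for account in accounts.get("data", []):
        # "accountname" is the key shown in the answer; skip entries without a name.
        name = (account.get("accountname") or "").strip()
        if not name:
            continue
        first = name[0].upper()
        # Anything that is not an ASCII letter or digit goes under "#"
        # (an assumed catch-all, not something the site defines).
        key = first if first.isascii() and first.isalnum() else "#"
        groups[key].append(name)
    # Sort the names within each group and return a plain dict ordered by key.
    return {key: sorted(names) for key, names in sorted(groups.items())}


# Hypothetical sample shaped like the response in the answer.
sample = {
    "data": [
        {"accountname": "OPTO-PRECISION PTE LTD"},
        {"accountname": "123 EXAMPLE PTE LTD"},
        {"accountname": "omega example"},
        {"accountname": None},
    ]
}

for initial, names in group_by_initial(sample).items():
    print(initial, names)
```

With the real data you would pass the accounts dictionary from the request above instead of sample.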

Upvotes: 1
