karim1104

Reputation: 23

Getting an HTML table with BeautifulSoup

I can't get the row data from the table on this page: https://www.saudiexchange.sa/wps/portal/tadawul/market-participants/issuers/issuers-directory?locale=en

Here's my code:

import requests
from bs4 import BeautifulSoup

url = requests.get('https://www.saudiexchange.sa/wps/portal/tadawul/market-participants/issuers/issuers-directory?locale=en')
soup = BeautifulSoup(url.content, 'html.parser')
tables = soup.find_all('table', attrs={"id": "companiesListTable"})
print(tables)

Where am I going wrong?

Upvotes: 0

Views: 134

Answers (2)

Bhavya Parikh

Reputation: 3400

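The table you are looking for is loaded dynamically, so its rows are not in the HTML that requests downloads. Instead, you can call the endpoint that supplies the data, which you can find in the browser's Network tab.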
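One quick way to confirm that the rows are missing from the downloaded HTML is to count the `<tr>` tags inside the target table. The sketch below uses only the standard library; the `sample` markup is a hypothetical stand-in for the page source (a header row with an empty body), not the site's actual HTML, and `TableRowCounter` and `count_rows` are names made up for this illustration.

```python
from html.parser import HTMLParser


class TableRowCounter(HTMLParser):
    """Count <tr> tags inside the <table> that has a given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # nesting depth of <table> tags inside the target
        self.rows = 0       # number of <tr> tags seen inside the target
        self.found = False  # whether the target table exists at all

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth > 0:
                self.depth += 1          # nested table inside the target
            elif dict(attrs).get("id") == self.table_id:
                self.found = True
                self.depth = 1
        elif tag == "tr" and self.depth > 0:
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "table" and self.depth > 0:
            self.depth -= 1


def count_rows(html, table_id):
    """Return (table_found, row_count) for the table with the given id."""
    parser = TableRowCounter(table_id)
    parser.feed(html)
    return parser.found, parser.rows


# Hypothetical static markup: the header exists, the body is filled in later.
sample = (
    '<table id="companiesListTable">'
    "<thead><tr><th>Symbol</th></tr></thead><tbody></tbody>"
    "</table>"
)
print(count_rows(sample, "companiesListTable"))  # (True, 1): header row only
```

If the real page gives a result like this (table found, but no data rows), the data is arriving through a separate request.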

How to find:

Open Chrome's developer tools, go to the Network tab, and reload the website. You can also copy a value from the table and paste it into the Network tab's search box to find the request that returns it. The last URL listed contains your data in JSON format.

import requests
import pandas as pd

url = requests.get('https://www.saudiexchange.sa/wps/portal/tadawul/market-participants/issuers/issuers-directory/!ut/p/z1/04_Sj9CPykssy0xPLMnMz0vMAfIjo8zi_Tx8nD0MLIy8DTyMXAwczVy9vV2cTY0MnEz1w8EKjIycLQwtTQx8DHzMDYEK3A08A31NjA0CjfWjSNLv7ulnbuAY6OgR5hYWYgzUQpl-AxPi9BvgAI4GhPVHgZXgCwFUBVi8iFcByA9gBXgcWZAbGhoaYZDpma6oCABqndOv/p0/IZ7_NHLCH082KOAG20A6BDUU6K3082=CZ6_NHLCH082K0H2D0A6EKKDC520B5=N/?sectorID=All&_=1630218635227')
main_data = url.json()['data']  # the rows are stored under the 'data' key
df = pd.DataFrame(main_data)
print(df)

Output:

   symbol  lonaName                            shortName        Acronym          isinCode
0  1330    Abdullah A. M. Al-Khodari Sons Co.  ALKHODARI        ALKHODARI        SA12L0O0KP12
1  4001    Abdullah Al Othaim Markets Co.      A.OTHAIM MARKET  A.OTHAIM MARKET  SA1230K1UGH7
....
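A slightly more defensive variant of the same idea might look like the sketch below. It assumes, as in the code above, that the rows sit under a `data` key; `extract_rows` and `fetch_rows` are hypothetical helper names, and the 30-second timeout is an arbitrary choice. Long portal URLs like the one above may stop working over time, so failing loudly on an HTTP error or an unexpected payload is probably more useful than a confusing `KeyError`.

```python
def extract_rows(payload, key="data"):
    """Return the list of row dicts stored under `key` in a decoded JSON payload."""
    rows = payload.get(key) if isinstance(payload, dict) else None
    if not isinstance(rows, list):
        raise ValueError(f"expected a list under {key!r} in the response")
    return rows


def fetch_rows(endpoint, timeout=30):
    """Download `endpoint` and return its rows, raising on HTTP errors."""
    import requests  # third-party; imported here so extract_rows works without it

    response = requests.get(endpoint, timeout=timeout)
    response.raise_for_status()   # raises requests.HTTPError on 4xx/5xx
    return extract_rows(response.json())


# Offline check with a hand-written payload (values taken from the output above).
example = {"data": [{"symbol": "1330", "isinCode": "SA12L0O0KP12"}]}
print(extract_rows(example))
```

The list returned by `fetch_rows` can be passed straight to `pd.DataFrame`, as in the original snippet.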

Image (not reproduced here): in the left-hand pane of the Network tab, searching for a value from the table shows the request URL that returns the data.

Upvotes: 1

TSnake

Reputation: 480

import requests
from pprint import pprint
url = requests.get('https://www.saudiexchange.sa/tadawul.eportal.theme.helper/ThemeSearchUtilityServlet')
pprint(url.json())

The table's row data can be fetched directly from this API, which returns JSON.
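The structure of this endpoint's JSON isn't shown here, so before building a DataFrame it may help to summarize the shape of the response first. The helper below is a sketch, not part of any library: `describe` is a made-up name, and the `sample` payload (including its `total` field) is hypothetical, used only so the example runs offline.

```python
def describe(obj, depth=2):
    """Return a compact description of a decoded JSON value's structure."""
    if isinstance(obj, dict):
        if depth == 0:
            return "dict"
        return {key: describe(value, depth - 1) for key, value in obj.items()}
    if isinstance(obj, list):
        if not obj or depth == 0:
            return f"list[{len(obj)}]"
        # Describe only the first element and note how many items there are.
        return [describe(obj[0], depth - 1), f"... {len(obj)} item(s)"]
    return type(obj).__name__


# Hypothetical payload; with the real endpoint, pass url.json() instead.
sample = {"data": [{"symbol": "1330", "isinCode": "SA12L0O0KP12"}], "total": 1}
print(describe(sample, depth=3))
```

Once you know which key holds the list of rows, that list can be handed to `pd.DataFrame` in the same way as in the other answer.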

Upvotes: 0
