Web scraping with BeautifulSoup - when trying to find table the content is not returned

Question

I am trying to scrape a website for a table but only the header is being returned.

I am new to Python and web scraping, and I have been following this tutorial, which was very helpful: https://medium.com/analytics-vidhya/how-to-scrape-a-table-from-website-using-python-ce90d0cfb607

However, the following code only returns the header and not the body of the table.

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

# URL of the page to scrape
url = 'https://www.dividendmax.com/dividends/declared'

# Request the page with a browser-like User-Agent and decode the HTML
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
page = urlopen(req)
html = page.read().decode("utf-8")
soup = BeautifulSoup(html, "html.parser")

# Obtain information from tag <table>
table1 = soup.find_all('table')
table1
Output:
[
 Company
 Ticker
 Country
 Exchange
 Share Price
 Prev. Dividend
 Next Dividend
 Next Ex-date
]


I need to retrieve the tbody content (found when expanding the penultimate row of output).
Just as an FYI, the following code will be used to create the dataframe.
import pandas as pd

# find_all('table') returns a list, so take the first table
table = table1[0]

# Obtain every title of columns with tag <th>
headers = []
for i in table.find_all('th'):
    title = i.text
    headers.append(title)

# Create a dataframe
mydata = pd.DataFrame(columns = headers)

# Create a for loop to fill mydata
for j in table.find_all('tr')[1:]:
    row_data = j.find_all('td')
    row = [i.text for i in row_data]
    length = len(mydata)
    mydata.loc[length] = row

chitown88 · Accepted Answer

The page you are after does not behave like the one in the tutorial, and it is probably not the best site if you're trying to learn or practice with BeautifulSoup. For me, though, the data comes back in a nice JSON format:

import requests
import pandas as pd

# URL to request
url = 'https://www.dividendmax.com/dividends/declared'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36'}

# Request the page with a browser-like User-Agent and parse the JSON response
jsonData = requests.get(url, headers=headers).json()
# Load the JSON records into a dataframe
df = pd.DataFrame(jsonData)

Output:

print(df)
                                            name  ...                 ind
0                                   3i Group plc  ...  [22, 25, 23, 3, 5]
1                          3I Infrastructure Plc  ...              [4, 5]
2                                AB Dynamics plc  ...                  []
3    Aberdeen Smaller Companies Income Trust plc  ...                  []
4      Aberdeen Standard Equity Income Trust plc  ...                  []
..                                           ...  ...                 ...
146                              Workspace Group  ...      [25, 4, 24, 5]
147                          Wynnstay Properties  ...                  []
148                                 XP Power Ltd  ...              [5, 4]
149                           Yew Grove REIT Plc  ...                  []
150                                       Yougov  ...                  []

[151 rows x 11 columns]
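
If you want to guard against the site not returning JSON (for example, if it serves plain HTML to a different client), one option is to check the response before decoding it. Below is a minimal sketch using only the standard library. The function name `parse_json_records` and the sample payloads are my own illustrations, not real data from the site.

```python
import json


def parse_json_records(content_type, body):
    """Return the decoded JSON payload, or None if the body is not JSON.

    content_type: value of the Content-Type response header (may be empty).
    body: the response text.
    """
    # JSON is usually labelled "application/json", possibly with a charset.
    if "json" not in content_type.lower():
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # The header claimed JSON but the body did not parse.
        return None


# Hypothetical sample payloads for illustration only.
print(parse_json_records("application/json; charset=utf-8", '[{"name": "Example plc"}]'))
print(parse_json_records("text/html", "<table></table>"))
```

With `requests`, the two arguments would be `response.headers.get("Content-Type", "")` and `response.text`. A result that is not `None` can be passed to `pd.DataFrame`, otherwise you know you received HTML and need a different approach.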
