Reputation: 13
I have been using BeautifulSoup to scrape the pricing information from "https://www.huaweicloud.com/pricing.html#/ecs".
I want to extract the tables from that page, but I get nothing back.
I am using Windows 10 with the latest BeautifulSoup and Requests, and Python 3.7.
import requests
from bs4 import BeautifulSoup
url = 'https://www.huaweicloud.com/pricing.html#/ecs'
headers = {'User-Agent':'Mozilla/5.0'}
response = requests.get(url,headers=headers)
soup = BeautifulSoup(response.content,'html.parser')
soup.find_all('table')
Running soup.find_all('table') returns an empty list: []
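One thing worth checking here: everything after the # in that URL is a fragment, which stays on the client side and is not sent to the server. The following is a minimal sketch using only the standard library's urllib.parse (the variable names fragment and requested_url are mine, for illustration) that shows which URL a plain HTTP request effectively targets:

```python
from urllib.parse import urlsplit, urlunsplit

url = 'https://www.huaweicloud.com/pricing.html#/ecs'
parts = urlsplit(url)

# The fragment is only interpreted client-side, presumably by the page's JavaScript.
fragment = parts.fragment

# What an HTTP client effectively asks the server for: the URL without the fragment.
requested_url = urlunsplit(parts._replace(fragment=''))

print(fragment)
print(requested_url)
```

If the tables are filled in by JavaScript after pricing.html loads, which the #/ecs routing suggests but which I have not verified, the HTML that requests receives would not contain any table elements for find_all to match.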
Upvotes: 1
Views: 77
Reputation: 109
I know this does not explain why your code returns an empty list, but it might help you. This is the code I came up with using Selenium and BeautifulSoup. You just have to specify the location of chromedriver, and the script is good to go.
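from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://www.huaweicloud.com/pricing.html#/ecs'
# Assumes Selenium 4, where the chromedriver path is passed through a Service object
driver = webdriver.Chrome(service=Service("location of chrome driver"))
driver.get(url)
# find_element_by_id was removed in Selenium 4; find_element(By.ID, ...) replaces it
driver.find_element(By.ID, "calculator_tab0").click()
# Give the page's JavaScript a few seconds to render the tables
time.sleep(3)
html_source = driver.page_source
soup = BeautifulSoup(html_source, features="lxml")
table_all = soup.find_all("table")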
output_rows = []
for table in table_all[:2]:
    for table_row in table.find_all('tr'):
        thead = table_row.find_all('th')
        columns = table_row.find_all('td')
        _thead = []
        for th in thead:
            _thead.append(th.text)
        output_rows.append(_thead)
        _row = []
        for column in columns:
            _row.append(column.text)
        output_rows.append(_row)
output_rows = [x for x in output_rows if x != []]
df = pd.DataFrame(output_rows)
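The fixed time.sleep(3) in the script can be too short on a slow connection and wasteful on a fast one. Below is a minimal sketch of polling for a condition instead, using only the standard library. Note that wait_until is a hypothetical helper written for illustration and is not part of Selenium; Selenium provides its own WebDriverWait class for the same purpose.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.25):
    """Call predicate() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed. Hypothetical helper for illustration, not a Selenium API.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

With the driver from the script, the sleep could then be replaced by something like wait_until(lambda: BeautifulSoup(driver.page_source, "lxml").find_all("table")), which returns as soon as at least one table has been rendered.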
Upvotes: 1