hey_arnold

Reputation: 141

Using a table id with BeautifulSoup to extract data in Python

I tried the following code, but it doesn't find the table, even though the same approach has worked on other web pages.

from bs4 import BeautifulSoup
from selenium import webdriver

chromedriver = (r'C:\Users\c\chromedriver.exe')

driver = webdriver.Chrome(chromedriver)
driver.get("https://isodzz.nafta.sk/yCapacity/#/?nav=ss.od.nom.c&lng=EN")

html = driver.page_source
soup = BeautifulSoup(html, "lxml")

table = soup.find_all('table', {'id':'nominations_point_data_c'})
print(table)

Upvotes: 0

Views: 134

Answers (1)

Abhishek Rai

Reputation: 2227

Do it like this. First you need to wait for the table to appear: this site is very slow to load, so page_source is read before the table has been rendered. Since the page contains a real table element in its HTML, we can let pandas parse it for a neat print.

from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

# Selenium 4 takes the driver path via Service; on Selenium 3 pass executable_path='C:/bin/chromedriver.exe' instead
driver = webdriver.Chrome(service=Service('C:/bin/chromedriver.exe'))
driver.get("https://isodzz.nafta.sk/yCapacity/#/?nav=ss.od.nom.c&lng=EN")
# Wait up to 25 seconds until the table container is visible
WebDriverWait(driver, 25).until(EC.visibility_of_element_located((By.CLASS_NAME, "MobileOverflow")))
page = driver.page_source  # HTML of the page after the table has rendered
driver.quit()
tables = pd.read_html(StringIO(page))  # Returns a list of DataFrames, one per <table>
table = tables[0]  # Get the first table on the page
print(table)

Output:

 Date: Confirmed Nomination                 
         Date:      Injection [MWh] Withdrawal [MWh]
0   01.11.2020           13 410.490       11 626.856
1   02.11.2020           11 874.096       12 227.510
2   03.11.2020                0.000            0.000
3   04.11.2020                0.000            0.000
4   05.11.2020                0.000            0.000
5   06.11.2020                0.000            0.000
6   07.11.2020                0.000            0.000
7   08.11.2020                0.000            0.000
8   09.11.2020                0.000            0.000
9   10.11.2020                0.000            0.000
10  11.11.2020           34 201.032       37 624.672
11  12.11.2020           54 427.560       27 940.872
12  13.11.2020           49 069.584       21 538.372
13  14.11.2020           54 361.138       15 312.000
14  15.11.2020           57 592.332       15 804.000
15  16.11.2020           57 515.424       20 280.000
16  17.11.2020           53 315.328       29 432.000
17  18.11.2020           48 960.672       26 192.000
18  19.11.2020           46 716.561       33 873.233
19  20.11.2020           43 852.200       43 806.382
20  21.11.2020           29 639.328       33 888.000
21  22.11.2020                0.000            0.000
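
Note that the figures in the output above are printed with a space as thousands separator, which suggests pandas kept them as strings rather than numbers. A minimal sketch of how they could be converted, assuming the separator is an ordinary or non-breaking space and the dates are in day.month.year form; parse_amount and parse_day are hypothetical helper names, not part of pandas or Selenium:

```python
from datetime import date, datetime


def parse_amount(text: str) -> float:
    """Convert a string such as '13 410.490' to a float."""
    # Remove ordinary, non-breaking and narrow non-breaking spaces
    # (assumed here to be the thousands separator).
    cleaned = text.replace(" ", "").replace("\u00a0", "").replace("\u202f", "")
    return float(cleaned)


def parse_day(text: str) -> date:
    """Convert a string such as '01.11.2020' (day.month.year) to a date."""
    return datetime.strptime(text.strip(), "%d.%m.%Y").date()


# One row copied from the output above.
row = ("01.11.2020", "13 410.490", "11 626.856")
day = parse_day(row[0])
injection = parse_amount(row[1])
withdrawal = parse_amount(row[2])
print(day, injection, withdrawal)
```

With a DataFrame, the same helpers could be applied column by column, for example with Series.map.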

Upvotes: 2
