Reputation: 37
I am trying to grab a table from this page https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm and other similar pages.
The table in question has a dynamic id of the form table-XXXX, where XXXX is a different number each time the page loads.
The table has the following attributes:
class="tablesaw tablesaw-stack table-bordered table-centered rates-availability-table"
data-tablesaw-mode="stack"
I have tried the following variants to locate this table (having consulted the post "How to find element by part of its id name in selenium with python"), but none of them works.
find_elements_by_css_selector("[id*='tab']")
find_elements_by_css_selector("[class*='tablesaw']")
find_elements_by_css_selector("[data-tablesaw-mode*='stack']")
Upvotes: 2
Views: 478
Reputation: 193208
The table is loaded via AJAX, so to print its values you have to induce WebDriverWait for visibility_of_element_located(). You can use either of the following Locator Strategies:
Using CSS_SELECTOR:
driver.get('https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.tablesaw.tablesaw-stack.table-bordered.table-centered.rates-availability-table"))).text)
Using XPATH:
driver.get('https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@class='tablesaw tablesaw-stack table-bordered table-centered rates-availability-table']"))).text)
Console Output:
Start Date End Date 3 Nights 4 Nights 5 Nights 6 Nights 7 Nights
28 Mar 2020 1 May 2020 £225 £300 £350 £410 £470
2 May 2020 26 Jun 2020 £250 £330 £400 £460 £530
27 Jun 2020 3 Jul 2020 - - - - £675
4 Jul 2020 10 Jul 2020 - - - - £920
11 Jul 2020 14 Aug 2020 - - - - £985
15 Aug 2020 21 Aug 2020 - - - - £920
22 Aug 2020 28 Aug 2020 - - - - £675
29 Aug 2020 31 Oct 2020 - - - - £470
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
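If you need the values as data rather than one printed string, the text returned by .text can be split into rows. This is only a rough sketch: parse_rates_text is a hypothetical helper (not part of Selenium), and it assumes every row looks like the console output above, i.e. two dates of three tokens each followed by the prices.

```python
from typing import Dict, List


def parse_rates_text(text: str) -> List[Dict[str, str]]:
    """Turn the table's .text output into one dict per data row.

    Hypothetical helper; assumes the layout shown in the console output.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Header tokens pair up: "Start Date", "End Date", "3 Nights", ...
    head = lines[0].split()
    columns = [" ".join(head[i:i + 2]) for i in range(0, len(head), 2)]
    rows = []
    for line in lines[1:]:
        tok = line.split()
        # Each date is three tokens (day, month, year); the rest are prices.
        values = [" ".join(tok[0:3]), " ".join(tok[3:6])] + tok[6:]
        rows.append(dict(zip(columns, values)))
    return rows


# Two rows copied from the console output above, used as sample input.
sample = """Start Date End Date 3 Nights 4 Nights 5 Nights 6 Nights 7 Nights
28 Mar 2020 1 May 2020 £225 £300 £350 £410 £470
27 Jun 2020 3 Jul 2020 - - - - £675"""

rows = parse_rates_text(sample)
print(rows[0]["7 Nights"])
```

In the Selenium code you would pass the .text of the located table instead of sample. If the page ever renders dates or prices differently, the token positions would need adjusting.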
Upvotes: 1
Reputation: 195553
The data is loaded dynamically via JavaScript, but you can request the table directly from the site's API.
For example:
import requests
from bs4 import BeautifulSoup
url = 'https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm'
rates_url = 'https://www.holidayfrancedirect.co.uk/api/property-rates/{property_id}/2020'
property_id = url.split('/')[-2]
data = requests.get(rates_url.format(property_id=property_id)).json()
soup = BeautifulSoup(data['ratesHtml'], 'html.parser')
# print table to screen:
for tr in soup.select('tr'):
    tds = [td.get_text(strip=True) for td in tr.select('td, th')]
    print(('{:<15}'*7).format(*tds))
Prints:
Start Date End Date 3 Nights 4 Nights 5 Nights 6 Nights 7 Nights
28 Mar 2020 1 May 2020 £225 £300 £350 £410 £470
2 May 2020 26 Jun 2020 £250 £330 £400 £460 £530
27 Jun 2020 3 Jul 2020 - - - - £675
4 Jul 2020 10 Jul 2020 - - - - £920
11 Jul 2020 14 Aug 2020 - - - - £985
15 Aug 2020 21 Aug 2020 - - - - £920
22 Aug 2020 28 Aug 2020 - - - - £675
29 Aug 2020 31 Oct 2020 - - - - £470
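Since you mention other similar pages, building the rates URL could be wrapped in a small function. This is a sketch only: rates_url_for is a hypothetical name, and the year argument is an assumption on my part. I have only seen the endpoint with 2020, so whether other years work is untested.

```python
from urllib.parse import urlsplit


def rates_url_for(page_url: str, year: int) -> str:
    """Build the rates API URL for a property page URL.

    Hypothetical helper; assumes the path looks like
    /holiday-rentals/<property_id>/index.htm as in the example above.
    """
    parts = urlsplit(page_url)
    # Drop empty segments so a leading slash does not shift the index.
    segments = [s for s in parts.path.split("/") if s]
    property_id = segments[-2]  # second-to-last segment, e.g. RG007075
    return f"{parts.scheme}://{parts.netloc}/api/property-rates/{property_id}/{year}"


page = "https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm"
api_url = rates_url_for(page, 2020)
print(api_url)
```

The result can be passed to requests.get(...) exactly as in the code above. Using urlsplit instead of splitting the whole URL keeps the property id lookup independent of the scheme and host.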
Upvotes: 0