Reputation: 642
I am attempting to capture the table on this page as an image: https://www.worldometers.info/coronavirus/country/us/
Here is the code I am using:
from selenium.webdriver.chrome.options import Options
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
extension = r'cjpalhdlnbpafiamejdnhcphjbkeiagm.crx'
chrome_options = Options()
chrome_options.add_extension(extension)
driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=r'chromedriver.exe')
url = 'https://www.worldometers.info/coronavirus/country/us/'
xpath = '//*[@id="usa_table_countries_today"]'
driver.get(url)
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, xpath))
    )
except:
    print("error")
    driver.close()
finally:
    element = driver.find_element_by_xpath(xpath)
    element.screenshot_as_png("test.png")
    driver.close()
I get the following error:
Traceback (most recent call last):
  File "C:\Users\someUser\PycharmProjects\project\venv\lib\site-packages\urllib3\connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "C:\Users\someUser\PycharmProjects\project\venv\lib\site-packages\urllib3\util\connection.py", line 84, in create_connection
    raise err
  File "C:\Users\someUser\PycharmProjects\project\venv\lib\site-packages\urllib3\util\connection.py", line 74, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
I have also attempted to use this code to get the table:
finally:
    element = driver.find_element_by_xpath(xpath)
    location = element.location
    size = element.size
    driver.save_screenshot("pageImage.png")
    x = location['x']
    y = location['y']
    width = location['x'] + size['width']
    height = location['y'] + size['height']
    im = Image.open('pageImage.png')
    im = im.crop((int(x), int(y), int(width), int(height)))
    im.save('element_image.png')
    driver.close()
But the above code crops the wrong section of the page.
To troubleshoot, I tried running Selenium both with and without the uBlock Origin extension. In both cases, the issue persists.
Any advice or help to get me going in the correct direction would be greatly appreciated!
Upvotes: 0
Views: 958
Reputation: 1836
Try the code below. It increases the window height so that a single screenshot captures the full page, which also covers your table.
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def test_fullpage_screenshot():
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--start-maximized')
    driver = webdriver.Chrome(options=chrome_options)
    driver.get("https://www.worldometers.info/coronavirus/country/us/")
    time.sleep(5)
    # the element with longest height on page
    ele = driver.find_element("xpath", '//*[@id="usa_table_countries_today"]')
    total_height = ele.size["height"] + 1000
    driver.set_window_size(1920, total_height)  # the trick
    time.sleep(2)
    driver.save_screenshot("screenshot1.png")
    driver.quit()


if __name__ == "__main__":
    test_fullpage_screenshot()
Note: You can increase or decrease the height as per your requirement.
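If you only want the table rather than the whole page, the cropping idea from the question can still work on top of the full-page screenshot. One possible reason it picked the wrong region (an assumption on my part, not something verified against this page) is that `element.location` is given in page coordinates, while the screenshot only covers the current viewport and may be scaled by the display's pixel ratio. Below is a minimal sketch of the coordinate conversion. The helper name `crop_box` and its parameters `scroll_x`, `scroll_y` and `scale` are hypothetical, introduced only for illustration.

```python
def crop_box(location, size, scroll_x=0, scroll_y=0, scale=1.0):
    """Convert an element's page position into screenshot pixel coordinates.

    location and size are dicts shaped like Selenium's element.location
    and element.size. scroll_x, scroll_y and scale are hypothetical
    parameters for the page scroll offset and the device pixel ratio.
    Returns (left, top, right, bottom), the box order Pillow's
    Image.crop expects.
    """
    # Shift from page coordinates to viewport coordinates, then scale
    # from CSS pixels to screenshot pixels.
    left = (location["x"] - scroll_x) * scale
    top = (location["y"] - scroll_y) * scale
    right = left + size["width"] * scale
    bottom = top + size["height"] * scale
    return tuple(int(round(v)) for v in (left, top, right, bottom))


# Example: an element 1200px down the page, with the page scrolled by
# 1000px and a display scale of 125%.
box = crop_box({"x": 100, "y": 1200}, {"width": 300, "height": 200},
               scroll_y=1000, scale=1.25)
print(box)
```

With a live driver, the scroll offsets and scale could be read with `driver.execute_script("return window.pageYOffset")` (and `window.pageXOffset`, `window.devicePixelRatio`), and the resulting tuple passed to `im.crop(...)`. Alternatively, Selenium's Python bindings provide `element.screenshot("table.png")`, which saves just that element and may avoid the manual crop entirely.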
Upvotes: 1