lajulajay

Reputation: 365

Scraping hidden elements

My question is twofold:

1) I'm attempting to log on to this page (source is here) using the code below. It's fine to use the credentials I've provided, which will expire 28 days from now; for anyone viewing this after that, it's relatively painless to create a trial account.

from selenium import webdriver
driver_path = 'Path to my downloaded chromedriver.exe file'
url_login = 'https://www.findacode.com/signin.html'
username = '[email protected]'
password = 'm%$)-Y95*^.1Gin+'

options = webdriver.ChromeOptions()
options.add_argument('headless')
driver = webdriver.Chrome(executable_path=driver_path, chrome_options=options)

driver.get(url_login)
assert '_submit_check' in driver.page_source
driver.find_element_by_name('id').send_keys(username)
driver.find_element_by_name('password').send_keys(password)
driver.find_element_by_xpath("//input[@value='Sign In']").submit()

I receive the following error for all 3 elements:

selenium.common.exceptions.ElementNotVisibleException: Message: element not visible

My command of HTML/CSS/JavaScript isn't strong, but I've tried using waits per this thread and received a timeout. I was going to try ActionChains from that thread next, but I'd love to hear from someone with more knowledge on this about how to proceed.

2) Ultimately I want to scrape specific code history data from this url (source here) by varying the code (last 5 characters of the url) in a loop. A user has to be logged in, hence my first question above, and the way to view the information I'm after in the browser is to expand the light purple "Code History" table. The specific information I'm after is the date from any row where the Action column is 'Added' and the Notes column is 'Code Added':

Date       Action Notes 
2018-01-01 Added  First appearance in code
2017-02-01 Added  Code Added

My question here is: since the table, which I believe is hidden, needs to be expanded with a click to expose the data I'm after, how do I proceed?

Edit: Here's code, pseudocode, and commentary to explain my second question.

url_code = "https://www.findacode.com/code.php?set=CPT&c="
driver.get(url_code + '0001U')  # I'm presuming that this will preserve the login session
# I intend for this to expand the Code History section and expose the table
# shown earlier in the post, but it's not doing that
driver.find_element_by_id('history').click()
# pseudocode: check whether the phrase "Code Added" occurs in the page source
# pseudocode: if so, grab the date in the <td nowrap> tag that is 2 tags to the left

I can use BeautifulSoup for the last two lines if it's not possible with Selenium, but I need help understanding why I'm not seeing the data I want to scrape.
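
To make the last two pseudocode steps concrete, here is a minimal sketch that uses only the standard library's html.parser rather than BeautifulSoup. It assumes the expanded table ends up in the page source as plain <tr>/<td> rows with Date, Action, and Notes cells in that order; the real markup was not checked, and the names HistoryRowParser, code_added_dates, and sample are made up for illustration.

```python
from html.parser import HTMLParser


class HistoryRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left open by broken markup


def code_added_dates(page_source):
    """Return dates of rows whose Action is 'Added' and Notes is 'Code Added'."""
    parser = HistoryRowParser()
    parser.feed(page_source)
    parser.close()
    return [
        row[0]
        for row in parser.rows
        if len(row) >= 3 and row[1] == "Added" and row[2] == "Code Added"
    ]


# Hypothetical markup modeled on the two rows shown earlier in the post.
sample = """
<table id="history">
  <tr><td nowrap>2018-01-01</td><td>Added</td><td>First appearance in code</td></tr>
  <tr><td nowrap>2017-02-01</td><td>Added</td><td>Code Added</td></tr>
</table>
"""
print(code_added_dates(sample))
```

With a live session, driver.page_source would be passed in place of sample. If the rows are only loaded after the click, they will not be in the source until the section has actually expanded.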

Upvotes: 2

Views: 1057

Answers (2)

undetected Selenium

Reputation: 193108

To log in to this website, you need to use WebDriverWait to wait until the desired elements are clickable. You can use the following solution:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
    driver_path = 'Path to my downloaded chromedriver.exe file'
    url_login = 'https://www.findacode.com/signin.html'
    username = '[email protected]'
    password = 'm%$)-Y95*^.1Gin+'
    
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument("start-maximized")
    options.add_argument('disable-infobars')
    driver = webdriver.Chrome(chrome_options=options, executable_path=driver_path)
    driver.get(url_login)
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//form[@name='login']//input[@name='id']"))).send_keys(username)
    driver.find_element_by_xpath("//form[@name='login']//input[@name='password']").send_keys(password)
    driver.find_element_by_xpath("//form[@name='login']//input[contains(@value,'Sign In')]").submit()
    print("Logged In successfully")
    
  • Console Output:

    Logged In successfully
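
For readers unfamiliar with explicit waits, the idea behind waiting for an element to become clickable can be sketched as a plain polling loop. This is a simplified stand-in written for illustration, not Selenium's actual implementation; the name wait_until and its default values are assumptions.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll)
```

A wait like this can only succeed if the condition eventually becomes true. If the locator matches an element that stays hidden, the condition never holds and the wait ends in a timeout, which may be what the question ran into.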
    

Upvotes: 1

Andersson

Reputation: 52665

There are two forms on the page, each with @name="id" and @name="password" inputs and a "Sign In" button. The first one is hidden, so you need to target the form with @name="login":

form = driver.find_element_by_name('login')
form.find_element_by_name('id').send_keys(username)
form.find_element_by_name('password').send_keys(password)
# the leading "." keeps the XPath relative to form; a bare "//" searches the whole document
form.find_element_by_xpath(".//input[@value='Sign In']").submit()
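
Newer Selenium 4 releases dropped the find_element_by_* helpers in favor of find_element(by, value). A minimal sketch of the same form-scoped approach in that style follows; the function name sign_in is made up for illustration, and the two locator strings are written out by hand (they equal By.NAME and By.XPATH from selenium.webdriver.common.by) so that the snippet needs no imports.

```python
# Locator strategy strings; these equal By.NAME and By.XPATH in
# selenium.webdriver.common.by, spelled out so the sketch has no imports.
BY_NAME = "name"
BY_XPATH = "xpath"


def sign_in(driver, username, password):
    """Fill in and submit the form named 'login' on the current page."""
    # Scope every lookup to the visible form so the hidden one is never matched.
    form = driver.find_element(BY_NAME, "login")
    form.find_element(BY_NAME, "id").send_keys(username)
    form.find_element(BY_NAME, "password").send_keys(password)
    # The leading "." keeps the XPath relative to `form`; a bare "//" would
    # search the whole document and could match the hidden form's button.
    form.find_element(BY_XPATH, ".//input[@value='Sign In']").submit()
```

With a real driver this would be called as sign_in(driver, username, password) right after driver.get(url_login).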

Upvotes: 3
