Reputation: 3
For my master's thesis, I am exploring the possibility of extracting data from a website via web automation. The steps are as follows:
I am stuck on steps 5, 6 and 7.
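import time
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
# Raw string so the backslashes in the Windows path are not treated as escapes
DRIVER_PATH = r'C:\webdriver\chromedriver.exe'
# Browser options (none set here)
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(executable_path=DRIVER_PATH, options=options)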
driver.maximize_window()
driver.get('https://www.metal.com/Copper/201102250376')
# Login steps
LoginClick1 = driver.find_element_by_css_selector('#__next > div > div.smm-component-header-en > div.main > div.right > button.button.sign-in')
LoginClick1.click()
user_input = driver.find_element_by_id('user_name')
user_input.send_keys('#####')
password_input = driver.find_element_by_id('password')
password_input.send_keys('####')
Submit = driver.find_element_by_css_selector( 'body > div:nth-child(17) > div > div.ant-modal-wrap.ant-modal-centered.smm-component-sign-en > div > div.ant-modal-content > div > div > div > div.smm-component-sign-en-content > form > div:nth-child(3) > div > div > span > button')
Submit.click()
time.sleep(2)
# Scroll down to the point of interest on the page
driver.execute_script("window.scrollBy(0,1000)", "")
# Change currency
driver.find_element(By.XPATH, "//img[contains(@class,'icon___BUqam')]").click()
time.sleep(1)
#change date from datepicker
date_input = driver.find_element_by_xpath('//*[@id="__next"]/div/div[5]/div[1]/div[7]/div[1]/div[2]/div[1]/span[1]/div/i')
date_input.click()
# Clear the existing date (10 characters), type the new one, and confirm with Enter.
# Everything is built as a single chain so no earlier actions are replayed by a later perform().
action = ActionChains(driver)
action.move_to_element(date_input).send_keys(Keys.BACKSPACE * 10).send_keys("01/01/2020").send_keys(Keys.ENTER).perform()
time.sleep(2)
I am stuck trying to scrape the data from the generated table and then save it to a CSV file using Selenium. See a row of the generated table below:
**May 27, 2022** **10,758.75-10,788.43** **10,773.59** **+97.94** **USD/mt**
Any help would be massively appreciated.
Downloading the file by pressing the Download button:
driver.find_element(By.XPATH,"//img[contains(@src,'https://static.metal.com/www.metal.com/4.1.161/static/images/price/download.png')]").click()
time.sleep(1)
driver.find_element(By.XPATH,"//img[contains(@src,'https://static.metal.com/www.metal.com/4.1.161/static/images/price/download_excel.png')]").click()
To save time, since I have multiple files to download, I am also exploring the possibility of saving the file directly via the download button provided.
Do you have any idea how to go about this?
Upvotes: 0
Views: 558
Reputation: 346
The sign-in button is not being clicked because the XPath //*[@id="__next"]/div/div[3]/div[2]/div[2]/button[2] is incorrect. The id __next belongs to the main container div, and the rest of the path tries to navigate from there to the sign-in button by spelling out the remaining HTML node structure.
Instead, you can select the sign-in button directly by its class value: //button[@class='button sign-in']
Your sign-in solution would look like this:
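from selenium import webdriver
from selenium.webdriver.common.by import By
# Raw string so the backslashes in the Windows path are not treated as escapes
driver = webdriver.Chrome(executable_path=r'C:\webdrivers\chromedriver.exe')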
driver.maximize_window()
driver.get('https://www.metal.com/Nickel/201102250239')
# Click on Sign In
driver.find_element(By.XPATH, "//button[@class='button sign-in']").click()
# Enter username
driver.find_element(By.ID, "user_name").send_keys("your username")
# Enter password
driver.find_element(By.ID, "password").send_keys("your password")
# Click Sign In
driver.find_element(By.XPATH, "//button[@type='submit']").click()
To scrape the data:
# Each table row is an element with this class; the divs inside it hold the cell values
for element in driver.find_elements(By.CLASS_NAME, "historyBodyRow___1Bk9u"):
    elements = element.find_elements(By.TAG_NAME, "div")
    print("Date=" + elements[0].text)
    print("Price Range=" + elements[1].text)
    print("Avg=" + elements[2].text)
    print("Change=" + elements[3].text)
    print("Unit=" + elements[4].text)
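The cell texts come back as plain strings, so if you need numbers for later analysis, a small parsing step can help. Here is a minimal sketch, assuming every row has the five cells shown in the question (date, price range, average, change, unit) and that dates use English month abbreviations. parse_row is a hypothetical helper name, not part of Selenium.

```python
from datetime import datetime


def parse_row(cells):
    """Convert the five cell strings of one table row into typed values.

    Assumes the layout from the question:
    date, "low-high" price range, average, change, unit.
    """
    date_text, range_text, avg_text, change_text, unit = cells

    def to_float(text):
        # Drop thousands separators; float() accepts a leading "+" or "-".
        return float(text.replace(",", ""))

    # Prices are assumed non-negative, so the first "-" separates low from high.
    low_text, high_text = range_text.split("-", 1)
    return {
        "date": datetime.strptime(date_text, "%b %d, %Y").date().isoformat(),
        "low": to_float(low_text),
        "high": to_float(high_text),
        "avg": to_float(avg_text),
        "change": to_float(change_text),
        "unit": unit,
    }
```

Inside the loop you could call it as parse_row([e.text for e in elements[:5]]).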
To add the rows to a CSV file:
import csv
# newline='' avoids blank lines between rows on Windows; the with block closes the file
with open('Path where you want to store the file', 'w', newline='') as f:
    writer = csv.writer(f)
    for element in driver.find_elements(By.CLASS_NAME, "historyBodyRow___1Bk9u"):
        elements = element.find_elements(By.TAG_NAME, "div")
        entry = [elements[0].text, elements[1].text, elements[2].text, elements[3].text, elements[4].text]
        writer.writerow(entry)
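If you want the CSV to be self-describing, you could also write a header line first. The following is only a sketch: the column names are taken from the print labels above, write_rows is a hypothetical helper, and the demo writes to an in-memory buffer instead of a real file so it runs without a browser.

```python
import csv
import io

# Column names assumed from the labels used when printing the rows.
HEADER = ["Date", "Price Range", "Avg", "Change", "Unit"]


def write_rows(rows, file_obj):
    """Write a header line followed by the scraped rows to an open file object."""
    writer = csv.writer(file_obj)
    writer.writerow(HEADER)
    writer.writerows(rows)


# Demo with an in-memory buffer; for a real file, open it with newline=''.
buffer = io.StringIO()
write_rows(
    [["May 27, 2022", "10,758.75-10,788.43", "10,773.59", "+97.94", "USD/mt"]],
    buffer,
)
csv_text = buffer.getvalue()
```

Values that contain commas, such as the price range, are quoted automatically by csv.writer, so they stay in a single column when the file is read back.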
Upvotes: 1