Reputation: 95
I am trying to download the 24-month data from www1.nseindia.com and it fails on Chrome and Firefox drivers. It just freezes after filling all the values in the required places and does not click. The webpage does not respond...
Below is the code that I am trying to execute:
import time
from selenium import webdriver
from selenium.webdriver.support.ui import Select
id_list = ['ACC', 'ADANIENT']
# Chrome
def EOD_data_Chrome():
driver = webdriver.Chrome(executable_path="C:\Py388\Test\chromedriver.exe")
driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
s1= Select(driver.find_element_by_id('dataType'))
s1.select_by_value('priceVolume')
s2= Select(driver.find_element_by_id('series'))
s2.select_by_value('EQ')
s3= Select(driver.find_element_by_id('dateRange'))
s3.select_by_value('24month')
driver.find_element_by_name("symbol").send_keys("ACC")
driver.find_element_by_id("get").click()
time.sleep(9)
s6 = Select(driver.find_element_by_class_name("download-data-link"))
s6.click()
# FireFox(Gecko)
def EOD_data_Gecko():
driver = webdriver.Firefox(executable_path="C:\Py388\Test\geckodriver.exe")
driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
s1= Select(driver.find_element_by_id('dataType'))
s1.select_by_value('priceVolume')
s2= Select(driver.find_element_by_id('series'))
s2.select_by_value('EQ')
s3= Select(driver.find_element_by_id('dateRange'))
s3.select_by_value('24month')
driver.find_element_by_name("symbol").send_keys("ACC")
driver.find_element_by_id("get").click()
time.sleep(9)
s6 = Select(driver.find_element_by_class_name("download-data-link"))
s6.click()
EOD_data_Gecko()
# Change the above final line to "EOD_data_Chrome()" and still it just remains stuck...
Kindly help with what is missing in that code to download the 24-month data... When I perform the same in a normal browser, with manual clicks, it is successful...
When you are manually doing it in a browser, you can change the values as below:
Set first drop down to : Security wise price volume data
"Enter Symbol" : ACC
"Select Series" : EQ
"Period" (radio button: "For Past") : 24 Months
Then click on the button, "Get Data", and in about 3-5seconds, the data loads, and then when you click on "Download file in CSV format", you can have the CSV file in your downloads
Need help using any library you know for scraping in Python: Selenium, Beautifulsoup, Requests, Scrappy, etc... Doesn't really matters unless it is python...
Edit: @Patrick Bormann, pls find the screenshot... The get data button works..
Upvotes: 2
Views: 751
Reputation: 11
I have edited the chromedriver.exe using hex editor and replaced cdc_
with dog_
and saved it. Then executed the below code using chrome driver.
import selenium
from selenium import webdriver
from selenium.webdriver.support.select import Select
import time
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("--disable-blink-features")
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options)
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'})
print(driver.execute_script("return navigator.userAgent;"))
# Open the website
driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
symbol_box = driver.find_element_by_id('symbol')
symbol_box.send_keys('20MICRONS')
driver.implicitly_wait(10)
#rd_period=driver.find_element_by_id('rdPeriod')
#rd_period.click()
list_daterange=driver.find_element_by_id('dateRange')
list_daterange=Select(list_daterange)
list_daterange.select_by_value('24month')
driver.implicitly_wait(10)
btn_getdata=driver.find_element_by_xpath('//*[@id="get"]')
btn_getdata.click()
driver.implicitly_wait(100)
print("Clicked button")
lnk_downloadData=driver.find_element_by_xpath('/html/body/div[2]/div[3]/div[2]/div[1]/div[3]/div/div[3]/div[1]/span[2]/a')
lnk_downloadData.click()
This code is working fine as of now. But the problem is that - this is not a permanent solution. NSE keeps on updating the logic to detect BOT execution in a better way. Like NSE, we will also have update our code. Please let me know if this code is not working. Will figure out some other solution.
Upvotes: 1
Reputation: 749
When you say that it works manually, have you try to simulate a click with action chains instead of the internal click function
from selenium.webdriver.common.action_chains import ActionChains
easy_apply = Select(driver.find_element_by_id('dateRange'))
actions = ActionChains(driver)
actions.move_to_element(easy_apply)
actions.click(easy_apply)
actions.perform()
and then you simulate a mouse movement to the specific value?
In addition, I tried it on my own and I didnt get any data when pushing on the button Get Data, as it seems to have a class of "get" as you mentioned, but this button doesnt work, but as you can see there exists a second button called full download, perhaps ypu try to use this one? Because the GetData Button doesnt work on Firefox and Chrome (when i tested it).
Did you already try to catch it through the link?
Update
As the OP asks for help in this urgent matter I delivered a working solution.
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import time
from selenium.webdriver.support.ui import Select
chrome_driver_path = "../chromedriver.exe"
driver = webdriver.Chrome(executable_path=chrome_driver_path)
driver.get('https://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
driver.execute_script("document.body.style.zoom='zoom 25%'")
time.sleep(2)
price_volume = driver.find_element_by_xpath('//*[@id="dataType"]/option[2]').click()
time.sleep(2)
date_range = driver.find_element_by_xpath('//*[@id="dateRange"]/option[8]').click()
time.sleep(2)
series = driver.find_element_by_name('series')
time.sleep(2)
drop = Select(series)
drop.select_by_value("EQ")
time.sleep(2)
driver.find_element_by_name("symbol").send_keys("ACC")
ez_download = driver.find_element_by_xpath('//*[@id="wrapper_btm"]/div[1]/div[3]/a')
actions = ActionChains(driver)
actions.move_to_element(ez_download)
actions.click(ez_download)
actions.perform()
Here you go, sorry, took a little, had to bring my son to bed...
This solution provides this output: I hope its correct. If you want to select other drop down menus you can change the string in the select (string because of too much indezes too handle) or the number in the xpath as the number highlights the index. The time is normally only for elements which need time to build themselves up on a webpage. But I made the experience that a too fast change sometimes causes errors. Feel free to change the time limit and see if it still works.
I hope you can now go on again in making some money for your living in India. All the best Patrick,
Do not hesitate to ask if you have any questions.
UPDATE2
After one long night and another day we figured out that the Freezing originates from the website, as the website uses:
Boomerang | Akamai Developer developer.akamai.com/tools/… Boomerangis a JavaScript library forReal User Monitoring (commonly called RUM). Boomerang measures the performance characteristics of real-world page loads and interactions. The documentation on this page is for mPulse’s Boomerang. General API documentation for Boomerang can be found atdocs.soasta.com/boomerang-api/. . What I discovered from the html header.
This is clearly a bot detection network/javascript. With the help of this SO post: Can a website detect when you are using Selenium with chromedriver?
And the second paragraph from that post:https://piprogramming.org/articles/How-to-make-Selenium-undetectable-and-stealth--7-Ways-to-hide-your-Bot-Automation-from-Detection-0000000017.html
I finally solved the issue: we changed the
var_key in chromedriver to something else like:
var key = '$dsjfgsdhfdshfsdiojisdjfdsb_';
In addition I changed the code to:
import time
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options
options = webdriver.ChromeOptions()
chrome_driver_path = "../chromedriver.exe"
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument('--disable-blink-features=AutomationControlled')
driver = webdriver.Chrome(executable_path=chrome_driver_path, options=options)
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
driver.get('http://www1.nseindia.com/products/content/equities/equities/eq_security.htm')
driver.execute_script("document.body.style.zoom='zoom 25%'")
time.sleep(5)
price_volume = driver.find_element_by_xpath('//*[@id="dataType"]/option[2]').click()
time.sleep(3)
date_range = driver.find_element_by_xpath('//*[@id="dateRange"]/option[8]').click()
time.sleep(5)
series = driver.find_element_by_name('series')
time.sleep(3)
drop = Select(series)
drop.select_by_value("EQ")
time.sleep(4)
driver.find_element_by_name("symbol").send_keys("ACC")
actions = ActionChains(driver)
ez_download = driver.find_element_by_xpath('/html/body/div[2]/div[3]/div[2]/div[1]/div[3]/div/div[1]/form/div[2]/div[3]/p/img')
actions.move_to_element(ez_download)
actions.click(ez_download)
actions.perform()
#' essential because the button has to be loaded
time.sleep(5)
driver.find_element_by_class_name('download-data-link').click()
The code finally worked and the OP is happy.
Upvotes: 1