Reputation: 2459
I want to extract page data from https://warthunder.com/en/community/userinfo/?nick=Hunter_i86 (with Hunter_i86 as an example nickname) for a Discord bot, in order to get the War Thunder stats of the players in the Discord chats.
When the page loads under Selenium, it stays stuck on the Cloudflare loading screen indefinitely. I have tried the latest versions of both Firefox and Chrome; both work fine on warthunder.com until they are controlled by Selenium (no issues on other websites).
Disclaimer: I am only resorting to scraping because I did not find any official API. I asked Gaijin (the company that manages War Thunder) and they told me there is none. I also have no intention of making more than one request per player per 24h, as https://thunderskill.com/en is already doing (which also tells me that scraping warthunder.com is possible).
I have also tried to make Selenium undetectable by following "Can a website detect when you are using Selenium with chromedriver?" (the driver would no longer work) and "Way to change Google Chrome user agent in Selenium?", but to no avail either, perhaps because of an update to the browsers, the drivers or Selenium (I have tried Firefox and Chrome, with exactly the same results).
So far I think it is Selenium that is being detected, but I am not certain, and I am unsure what to do to get through. Any help is most appreciated.
Upvotes: 0
Views: 1527
Reputation: 19939
from selenium import webdriver

options = webdriver.ChromeOptions()
# Reduce the automation markers Chrome exposes to the page
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_argument("--disable-blink-features=AutomationControlled")
# Present a regular desktop Chrome user agent
options.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36")
options.add_argument("--remote-debugging-port=9222")
driver = webdriver.Chrome(options=options)
# Open the page with window.open() instead of driver.get()
driver.execute_script(
    "window.open('https://warthunder.com/en/community/userinfo/?nick=Hunter_i86')")
There seems to be a redirect issue with driver.get(); opening the page with window.open() works. The options are there to avoid automation detection.
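One thing to keep in mind with window.open(): Selenium keeps the driver focused on the original tab, so the new tab has to be selected before its content can be read. A minimal sketch of that step follows, assuming a driver built as above. The helper names profile_url and open_in_new_tab are made up for illustration and are not part of Selenium.

```python
from urllib.parse import urlencode

# Base URL taken from the question; the nickname goes in the "nick" parameter.
PROFILE_URL = "https://warthunder.com/en/community/userinfo/"


def profile_url(nick):
    """Build the profile URL for a nickname (hypothetical helper)."""
    return f"{PROFILE_URL}?{urlencode({'nick': nick})}"


def open_in_new_tab(driver, url):
    """Open url with window.open() and switch the driver to the new tab.

    Hypothetical helper: `driver` is expected to behave like a Selenium
    WebDriver (window_handles, execute_script, switch_to.window).
    Returns the handle of the newly opened tab.
    """
    before = set(driver.window_handles)
    # Extra arguments to execute_script are exposed to the script as `arguments`.
    driver.execute_script("window.open(arguments[0])", url)
    new_handles = [h for h in driver.window_handles if h not in before]
    if not new_handles:
        raise RuntimeError("window.open() did not create a new tab")
    driver.switch_to.window(new_handles[-1])
    return new_handles[-1]
```

With the driver from the snippet above, this would be called as open_in_new_tab(driver, profile_url("Hunter_i86")). After the switch, driver.page_source refers to the new tab, though the page may still need some time to get past the Cloudflare check before the stats are present.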
Upvotes: 2