Migatte

Reputation: 29

Getting TypeError: WebDriver.__init__() got an unexpected keyword argument 'desired_capabilities' despite 'desired_capabilities' being deprecated

I'm building a scraper, connected to a Telegram bot, that collects offers on Amazon. However, I'm running into problems when I try to use a proxy server: with the code attached below I get the following error:

File "/home/X/.local/lib/python3.10/site-packages/seleniumwire/webdriver.py", line 308, in __init__
    super().__init__(*args, **kwargs)
TypeError: WebDriver.__init__() got an unexpected keyword argument 'desired_capabilities'

I've read that desired_capabilities was deprecated and then removed in the version of Selenium I'm using (4.11.2), so why am I still getting this error?

I'm attaching the code I'm using. Note that the proxy in this case is from ScraperApi, whose documentation at this link shows the code to use for this type of request.

import os

from seleniumwire import webdriver

def start_selenium():
    chromium_options = webdriver.ChromeOptions()  
    chromium_options.add_argument("--headless")
    chromium_options.add_argument("--disable-extensions")
    chromium_options.add_argument("--disable-infobars")
    chromium_options.add_argument("--disable-notifications")
    chromium_options.add_argument("--disable-translate")
    chromium_options.add_argument("--incognito")   

    # ScraperApi credentials are read from the SCRAPERAPI_KEY environment variable
    proxy_options = {
        'proxy': {
            'http': f'http://scraperapi:{os.environ["SCRAPERAPI_KEY"]}@proxy-server.scraperapi.com:8001',
            'https': f'http://scraperapi:{os.environ["SCRAPERAPI_KEY"]}@proxy-server.scraperapi.com:8001',
            'no_proxy': 'localhost,127.0.0.1'
        }
    }

    # This Remote() call is the one that raises the TypeError
    chromium_driver = webdriver.Remote(command_executor='http://localhost:4444/wd/hub',
                                       options=chromium_options,
                                       seleniumwire_options=proxy_options)
    
    return chromium_driver

Also, related to this question: would it still be possible to use my original code, shown below, to wait for items to load even when going through the proxy? I also tried the ScrapeOps proxy and, although I managed to start it locally, it made too many requests in a short time with this code:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def get_all_deals_ids():
    # is_product, get_submenus_urls and extract_product_id are helper functions
    # defined elsewhere in the project
    deals_page = "https://www.amazon.it/......."

    selenium_driver = start_selenium()

    try:
        selenium_driver.get(deals_page)

        WebDriverWait(selenium_driver, 60).until(EC.presence_of_element_located((By.CSS_SELECTOR, "a[class*='DealCard']"))) 

        elements_urls = [e.get_attribute("href") for e in selenium_driver.find_elements(By.CSS_SELECTOR, "a[class*='DealCard']")]

        deals_urls = []
        for url in elements_urls:
            if is_product(url):
                deals_urls.append(url)
            if ('/deal/' in url) or ('/browse/' in url):
                deals_urls.extend(get_submenus_urls(url))

        product_ids = {}
        for i, url in enumerate(deals_urls, start=1):
            # call extract_product_id only once per URL
            product_id = extract_product_id(url)
            if product_id is not None and product_id != '' and product_id not in product_ids:
                product_ids[product_id] = i

        return list(product_ids.items())

    except Exception as e:
        print(e)
        return []
    finally:
        selenium_driver.quit()

Thanks in advance to anyone who can give me some suggestions!

I tried running the code attached at the beginning with the ScraperApi proxy, but I always got the desired_capabilities error. I then tried another proxy, this time with a local webdriver instead of a remote one, but it ended up making calls to every link it encountered on the deals_page, needlessly consuming my precious API calls.
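
(Side note: selenium-wire's request interceptor can abort requests before they leave the browser, so something along these lines might limit which hosts actually consume proxy calls. This is a rough, untested sketch, and the amazon.it check is only illustrative.)

from urllib.parse import urlparse

def limit_proxy_traffic(driver):
    # Abort any request whose host is not an amazon.it domain, so only the
    # deal pages themselves go through the proxy
    def interceptor(request):
        host = urlparse(request.url).hostname or ""
        if not host.endswith("amazon.it"):
            request.abort()

    driver.request_interceptor = interceptor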

Upvotes: 1

Views: 1053

Answers (1)

Michael Mintz

Reputation: 15556

It's failing because the selenium-wire library is still using desired_capabilities in the Remote() definition: https://github.com/wkeeling/selenium-wire/blob/master/seleniumwire/webdriver.py#L298

Unfortunately, this issue is not likely to be fixed due to the library being archived: https://github.com/wkeeling/selenium-wire

...which means you'll either need to fork the repo yourself to make that fix, find someone else who already did that, or downgrade your selenium version to one where desired_capabilities still exists.
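
For the downgrade route: desired_capabilities was removed from the Selenium driver constructors in 4.10.0, so pinning the previous release line together with the final selenium-wire release should be enough, e.g. (treat the exact pins as a suggestion to verify against PyPI, not a guarantee):

pip install "selenium<4.10" selenium-wire==5.1.0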

Or maybe you can do what you want to do without using selenium-wire at all, or at least without using Remote(). Most of selenium-wire still works with the latest version of selenium; it's just the method you're trying to use that doesn't.
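
For example, if a local driver is an option for you, something like this should avoid the Remote() code path entirely. This is only a rough, untested sketch that assumes chromedriver is resolvable on your machine and reuses the ScraperApi proxy settings from the question:

import os

from seleniumwire import webdriver

def start_selenium_local():
    # Rough sketch: same idea as start_selenium() in the question, but with a
    # local Chrome driver instead of webdriver.Remote()
    chromium_options = webdriver.ChromeOptions()
    chromium_options.add_argument("--headless")
    chromium_options.add_argument("--incognito")

    proxy_options = {
        'proxy': {
            'http': f'http://scraperapi:{os.environ["SCRAPERAPI_KEY"]}@proxy-server.scraperapi.com:8001',
            'https': f'http://scraperapi:{os.environ["SCRAPERAPI_KEY"]}@proxy-server.scraperapi.com:8001',
            'no_proxy': 'localhost,127.0.0.1'
        }
    }

    # selenium-wire's Chrome wrapper accepts seleniumwire_options just like Remote
    return webdriver.Chrome(options=chromium_options,
                            seleniumwire_options=proxy_options)

Whether headless Chrome behind the ScraperApi proxy behaves the same as your remote grid setup is a separate question, so treat this only as a starting point.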

Upvotes: 0
