Windstorm1981

Reputation: 2680

Python Selenium find_elements_by_class_name Error

I am scraping a Google results page that has returned links to LinkedIn profiles.

I want to collect the links on the page and put them in a Python list.

Problem is I can't seem to properly extract them from the page and I don't know why.

The Google results page displays 10 entries like the following:

Mary Smith - Director of Talent Acquisition ...
https://www.linkedin.com › marysmith
Anytown, Arizona 500+ connections ... Experienced Talent Acquisition Director, with a 
demonstrated history of working in the marketing and advertising ...

The source code looks like this:

<div data-hveid="CAIQAA" data-ved="2ahUKEwjLv6HMr4HmAhWluVkKHfjfA1EQFSgAMAF6BAgCEAA">
   <div class="rc"> 
       <div class="r">
           <a href="https://www.linkedin.com/in/marysmith" ping="/url?sa=t&amp;source=web&amp;rct=j&amp;url=https://www.linkedin.com/in/marysmith&amp;ved=2ahUKEwjLv6HMr4HmAhWluVkKHfjfA1EQFjABegQIAhAB">
               <h3 class="LC20lb"><span class="S3Uucc">Mary Smith - Director of Talent Acquisition, Culture Curator ...</span></h3><br>
               <div class="TbwUpd">
                   <cite class="iUh30 bc">https://www.linkedin.com › marysmith</cite>
              </div>
           </a>
           ...

In my script I'm using Selenium and find_element_by_class_name() to collect all the instances of the links to LinkedIn. The one in the above example is https://www.linkedin.com › marysmith. It is one line of code where I use driver.find_element_by_class_name() with the particular class name:

linkedin_urls = driver.find_element_by_class_name("iUh30 bc")

However I get the following error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[name="iUh30 bc"]"}

I've tried various permutations and other classes but it won't work. If I use the XPath for one of those links, the script WILL return that single link.

What am I doing wrong?

Upvotes: 0

Views: 524

Answers (2)

Ahmed Soliman

Reputation: 1710

Websites like Google and Facebook generate their page source automatically and assign obfuscated class names that can change between page loads, so a locator built on one of those classes may stop matching and raise no such element. To avoid this, locate elements by stable tags or attributes instead.

Try something like:

#<cite class="iUh30 bc">https://www.linkedin.com › mary-smith-mckenzie-8b660799</cite>
driver.find_elements_by_xpath("//cite[contains(text(),'›') and contains(text(),'linkedin.com')]")
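
To get the actual profile URLs into a Python list, something like the sketch below might work. It assumes each <cite> sits inside the <a> shown in the question's markup, so it walks up to that ancestor and reads its href. The names collect_linkedin_urls and LINK_XPATH are made up for illustration, and driver is assumed to be an already-started Selenium WebDriver with the results page loaded.

```python
# Value of selenium.webdriver.common.by.By.XPATH, written out so this
# sketch has no import-time dependency on Selenium.
XPATH = "xpath"

# Assumption: every matching <cite> is nested inside the result's <a>,
# as in the markup quoted in the question.
LINK_XPATH = (
    "//cite[contains(text(),'›') and contains(text(),'linkedin.com')]"
    "/ancestor::a"
)


def collect_linkedin_urls(driver):
    """Return the href of each matching result link, without duplicates."""
    urls = []
    for anchor in driver.find_elements(XPATH, LINK_XPATH):
        href = anchor.get_attribute("href")
        # Skip anchors without an href and links already collected.
        if href and href not in urls:
            urls.append(href)
    return urls
```

Calling collect_linkedin_urls(driver) returns a plain list of strings. The generic driver.find_elements(by, value) form is used because it exists in both older and newer Selenium releases.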

Upvotes: 1

pguardiario

Reputation: 55012

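find_element_by_class_name() expects a single class name, and "iUh30 bc" is two classes separated by a space, so the lookup fails. Try a CSS selector that chains both classes instead: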

driver.find_element_by_css_selector(".iUh30.bc")
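
Since the question asks for every link rather than just the first match, here is a rough sketch of the plural form. It assumes the class names are still "iUh30 bc" when the page loads. The helpers class_attr_to_css and collect_cite_texts are hypothetical names, and driver is assumed to be a running Selenium WebDriver.

```python
# Value of selenium.webdriver.common.by.By.CSS_SELECTOR, written out so
# this sketch has no import-time dependency on Selenium.
CSS_SELECTOR = "css selector"


def class_attr_to_css(class_attr):
    """Turn a class attribute such as "iUh30 bc" into ".iUh30.bc"."""
    return "".join("." + name for name in class_attr.split())


def collect_cite_texts(driver, class_attr="iUh30 bc"):
    """Return the visible text of every element carrying all given classes."""
    selector = class_attr_to_css(class_attr)
    # find_elements (plural) returns a list, which is empty when nothing
    # matches, instead of raising NoSuchElementException.
    elements = driver.find_elements(CSS_SELECTOR, selector)
    return [element.text for element in elements]
```

Chaining the classes with dots and no space means "an element that has both classes". With a space, ".iUh30 bc" would instead look for a <bc> tag nested inside an element with class iUh30.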

Upvotes: 0
