James

Reputation: 274

Getting all attributes under <id> tree? (Python selenium)

I'll jump straight to the code:

<ul id="SearchResultsDetails-MainContent" class="list unstyled SearchResultsBlock">
    <li id ="result_pdb5J55" class="row oneSearchResult"></li>
    <li id ="result_pdb5LUF" class="row oneSearchResult"></li>
    <li id ="result_pdb5B1J" class="row oneSearchResult"></li>
    < ... >

I have set up a basic for loop as follows:

 data=[]
 some_objects=driver.find_elements_by_id("SearchResultsDetails-MainContent")
 for objects in some_objects:
     datum=objects.find_element_by_class_name("row_oneSearchResult").get_attribute("id")
     data.append(datum)

I am trying to scrape "result_pdb5J55", "result_pdb5LUF", etc.

I am having a lot of difficulty with this, however: either the IDE returns no result or the code raises a NoSuchElementException.

Upvotes: 1

Views: 2024

Answers (3)

NarendraR

Reputation: 7708

Looking at your HTML, the class attribute on each <li> tag holds compound classes (row and oneSearchResult), and you are trying to match them with find_element_by_class_name("row_oneSearchResult"). That locator accepts only a single class name, so this lookup cannot succeed.
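To illustrate why the lookup fails, here is a minimal sketch in plain Python (no browser needed) of how a class attribute is read as whitespace-separated tokens. The helper has_class is hypothetical and only imitates the matching rule; it is not part of Selenium.

```python
def has_class(class_attr: str, name: str) -> bool:
    """Return True if `name` is one of the whitespace-separated class tokens."""
    return name in class_attr.split()


# Class attribute value taken from the <li> elements in the question.
class_attr = "row oneSearchResult"

print(has_class(class_attr, "oneSearchResult"))      # a real token
print(has_class(class_attr, "row"))                  # a real token
print(has_class(class_attr, "row_oneSearchResult"))  # not a token: the underscore form never appears
```

Only the first two names are actual classes on the element, which is why a class-name locator has to be given one of them on its own.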

Change the locator to an XPath selector such as find_element_by_xpath("//li[@class='row oneSearchResult']") and try again, for example:

data=[]
parent = driver.find_element_by_id("SearchResultsDetails-MainContent")
# the leading "." keeps the XPath search inside parent instead of the whole page
some_objects = parent.find_elements_by_xpath(".//li[@class='row oneSearchResult']")
for ob in some_objects:
    data.append(ob.get_attribute("id"))
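The XPath expression can be checked without a browser. The sketch below uses the standard library's xml.etree.ElementTree instead of Selenium, and assumes the markup from the question is closed with a </ul> so that it parses as XML; the real page may differ.

```python
import xml.etree.ElementTree as ET

# Markup from the question, closed with </ul> (an assumption) so it is well-formed.
markup = """
<ul id="SearchResultsDetails-MainContent" class="list unstyled SearchResultsBlock">
    <li id="result_pdb5J55" class="row oneSearchResult"></li>
    <li id="result_pdb5LUF" class="row oneSearchResult"></li>
    <li id="result_pdb5B1J" class="row oneSearchResult"></li>
</ul>
"""

parent = ET.fromstring(markup)  # the <ul> element

# ".//li" searches only the descendants of parent; the predicate compares
# the whole class attribute string, so it must match exactly.
ids = [li.get("id") for li in parent.findall(".//li[@class='row oneSearchResult']")]
print(ids)
```

Because the predicate is an exact string comparison, it would stop matching if the page added another class or reordered the existing ones.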

Upvotes: 0

Guy

Reputation: 50864

SearchResultsDetails-MainContent is the id of the single parent element; the list items are the children of that element. In addition, each child has two classes, row and oneSearchResult, not one class named row_oneSearchResult. You have several options:

Locate the children using the parent selector

data=[]
some_objects = driver.find_elements_by_css_selector("#SearchResultsDetails-MainContent > .oneSearchResult")
for ob in some_objects:
    data.append(ob.get_attribute("id"))

Locate the parent element and use it to locate the children

data=[]
parent = driver.find_element_by_id("SearchResultsDetails-MainContent")
some_objects = parent.find_elements_by_class_name("oneSearchResult")
for ob in some_objects:
    data.append(ob.get_attribute("id"))

Locate the children by their locators

data=[]
parent = driver.find_element_by_id("SearchResultsDetails-MainContent")
# by class oneSearchResult
some_objects = parent.find_elements_by_class_name("oneSearchResult")
#or by both classes
some_objects = parent.find_elements_by_css_selector(".row.oneSearchResult")
#or by partial id
some_objects = parent.find_elements_by_css_selector("[id*='result_pdb']")
for ob in some_objects:
    data.append(ob.get_attribute("id"))
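For newer Selenium versions: the find_elements_by_* helpers used in this thread were deprecated in Selenium 4 and removed in later 4.x releases, in favor of find_elements(by, value). A minimal sketch of the first option in that style follows; the function name collect_result_ids is hypothetical, and the string "css selector" is the value of By.CSS_SELECTOR, written out here so the snippet does not need Selenium installed just to be read or tested.

```python
# In real code you would normally write:
#     from selenium.webdriver.common.by import By
# and pass By.CSS_SELECTOR, whose value is the string below.
CSS_SELECTOR = "css selector"


def collect_result_ids(driver):
    """Return the id attribute of every result <li> directly under the list.

    `driver` is assumed to be a Selenium WebDriver (or anything exposing
    the same find_elements(by, value) method).
    """
    items = driver.find_elements(
        CSS_SELECTOR, "#SearchResultsDetails-MainContent > .oneSearchResult"
    )
    return [item.get_attribute("id") for item in items]
```

With a live driver that has already loaded the page, usage would be data = collect_result_ids(driver).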

Upvotes: 1

Brian

Reputation: 3131

There is only one object in your markup that the following statement might find and that is the <ul> object.

some_objects=driver.find_elements_by_id("SearchResultsDetails-MainContent")

What you need to do is find that object and then loop over its children. Try something along these lines:

data=[]
container_obj = driver.find_element_by_id("SearchResultsDetails-MainContent")
for child in container_obj.find_elements_by_tag_name("li"):
  data.append(child.get_attribute("id"))

Upvotes: 0
