Kyle Linden

Reputation: 69

Combining multiple for loops into CSV with selenium

I'm trying to scrape a directory site for some information, and I'm having trouble combining two sets of find_elements results into a single CSV file. I have two lists, company_links_elements and company_address_elements, and ultimately I'd like to write their contents to a CSV. The problem is that I don't know how to run two for loops together or zip the lists. Can you help me figure out how to get a CSV with the following three columns: company_name, company_url, and company_address?

company_links_elements = driver.find_elements(By.XPATH, "//h3[@class='jss320 jss324 jss337 sc-gzOgki eucExu']/a")
company_address_elements = driver.find_elements(By.XPATH, "//strong[@class='dtm-search-listing-address']")

with open('links.csv', 'w', newline='') as file:  # newline='' avoids blank rows on Windows
    writer = csv.writer(file)
    for company in company_links_elements:
        company_url = company.get_attribute("href")
        company_name = company.get_attribute("text")
        # NEED COMPANY ADDRESS HERE
        writer.writerow((company_name, company_url))

driver.close()

Note that company_address_elements is never used above: I don't know how to feed it into csv.writer to write the additional company_address column.

Upvotes: 1

Views: 72

Answers (2)

Todor Minakov

Reputation: 20077

Here's the zip version:

for company, address in zip(company_links_elements, company_address_elements):
    company_url = company.get_attribute("href")
    company_name = company.get_attribute("text")
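    # a <strong> element has no "text" property, so read its rendered text instead
    company_address = address.text
    writer.writerow((company_name, company_url, company_address))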

On every iteration, company and address are the elements at the same index in their respective lists, and the loop stops when the shorter list is exhausted. The benefit over enumerate is that you won't hit an IndexError if one of the lists is shorter than the other; the drawback is that you don't have the current index at hand (but you don't use it anyway).
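
To see the truncation behaviour without starting a browser, here is a minimal sketch that uses plain tuples and strings as stand-ins for the Selenium elements. The sample companies and the helper name rows_to_csv are made up for illustration; only csv.writer and zip come from the thread.

```python
import csv
import io


def rows_to_csv(links, addresses):
    """Pair links with addresses by position and return the CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(("company_name", "company_url", "company_address"))
    # zip stops at the shorter input, so unmatched items are silently dropped
    for (name, url), address in zip(links, addresses):
        writer.writerow((name, url, address))
    return buffer.getvalue()


# Hypothetical sample data standing in for the scraped elements
links = [
    ("Acme", "https://example.com/acme"),
    ("Globex", "https://example.com/globex"),
]
addresses = ["1 Main St"]  # one address missing on purpose

csv_text = rows_to_csv(links, addresses)
```

With two links and one address, only the header and the first company end up in csv_text; the second company is skipped rather than raising an error, which may or may not be what you want.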

Upvotes: 2

gangabass

Reputation: 10666

for idx, company in enumerate(company_links_elements):
    company_url = company.get_attribute("href")
    company_name = company.get_attribute("text")
    # look up the address at the same position in the second list
    address = company_address_elements[idx].get_attribute(...)
    writer.writerow((company_name, company_url, address))
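
Indexing by idx raises an IndexError when the page yields fewer addresses than links. A possible guard, sketched here with plain data instead of Selenium elements (the function name rows_with_index and the missing parameter are hypothetical), keeps every company and leaves the address blank when none is available:

```python
def rows_with_index(links, addresses, missing=""):
    """Return (name, url, address) rows, one per link.

    Falls back to `missing` when there is no address at that index.
    """
    rows = []
    for idx, (name, url) in enumerate(links):
        # only index into addresses when the position actually exists
        address = addresses[idx] if idx < len(addresses) else missing
        rows.append((name, url, address))
    return rows


# Hypothetical sample data: two companies, one address
rows = rows_with_index(
    [("Acme", "https://example.com/acme"), ("Globex", "https://example.com/globex")],
    ["1 Main St"],
)
```

Each tuple in rows can then be passed to writer.writerow. Unlike the zip approach, this keeps companies without an address instead of dropping them.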

Upvotes: 1
