Reputation: 69
So, I'm trying to figure out how to scrape a directory site for some information, and I'm having trouble combining two sets of find_elements results into a writable CSV file. I have two lists of information: company_links_elements and company_address_elements. Ultimately, what I'd like to do is write the following information to a CSV. The problem is I don't know how to run two for loops together or zip the arrays. Can you help me figure out how to get a CSV with the following three columns: company_name, company_url, and company_address?
company_links_elements = driver.find_elements(By.XPATH, "//h3[@class='jss320 jss324 jss337 sc-gzOgki eucExu']/a")
company_address_elements = driver.find_elements(By.XPATH, "//strong[@class='dtm-search-listing-address']")
with open('links.csv', 'w') as file:
    writer = csv.writer(file)
    for company in company_links_elements:
        company_url = company.get_attribute("href")
        company_name = company.get_attribute("text")
        # NEED COMPANY ADDRESS HERE
        writer.writerow((company_name, company_url))
driver.close()
Notice the company_address_elements... I don't know how to include that into the csv.writer to write the additional column for the company_address.
Upvotes: 1
Views: 72
Reputation: 20077
Here's the zip version:
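for company, address in zip(company_links_elements, company_address_elements):
    company_url = company.get_attribute("href")
    company_name = company.get_attribute("text")
    # .text returns the element's visible text; a <strong> element has no "text" property
    company_address = address.text
    writer.writerow((company_name, company_url, company_address))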
On every iteration, company and address will be the corresponding elements of the two lists, at the same index. The loop stops when the shorter list is exhausted. Benefit over enumerate: you won't hit an IndexError if one of the lists is shorter than the other. Drawback: you don't have the current index at hand (but you don't use it anyway :).
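A minimal, browser-free sketch of the same zip pattern, in case it helps to see the three columns written end to end. The sample names, URLs, and addresses are made up and stand in for the values read from the Selenium elements; rows_to_csv is a hypothetical helper, not part of the original code.

```python
import csv
import io

# Hypothetical stand-ins for the values read from the Selenium elements.
company_links = [
    ("Acme Ltd", "https://example.com/acme"),
    ("Globex", "https://example.com/globex"),
]
company_addresses = ["1 Main St", "22 Side Rd"]


def rows_to_csv(links, addresses):
    """Pair each (name, url) with the address at the same index and return CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(("company_name", "company_url", "company_address"))
    # zip stops at the shorter input, so a missing address never raises IndexError.
    for (name, url), address in zip(links, addresses):
        writer.writerow((name, url, address))
    return buffer.getvalue()


csv_text = rows_to_csv(company_links, company_addresses)
print(csv_text)
```

When writing to a real file instead of a StringIO, the csv module documentation recommends opening it with newline='' so that no extra blank lines appear on Windows.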
Upvotes: 2
Reputation: 10666
for idx, company in enumerate(company_links_elements):
    company_url = company.get_attribute("href")
    company_name = company.get_attribute("text")
    address = company_address_elements[idx].get_attribute(...)
    writer.writerow((company_name, company_url, address))
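Indexing by idx raises IndexError if the address list turns out shorter than the link list. If silently dropping rows (as zip does) is not wanted either, one possible alternative is itertools.zip_longest, which pads the shorter list. This is only a sketch with made-up sample data; pair_with_padding is a hypothetical helper name.

```python
from itertools import zip_longest

# Made-up sample data: the last company has no address.
names = ["Acme Ltd", "Globex", "Initech"]
addresses = ["1 Main St", "22 Side Rd"]


def pair_with_padding(left, right, missing=""):
    """Pair items by position, filling gaps in the shorter list with `missing`."""
    return list(zip_longest(left, right, fillvalue=missing))


pairs = pair_with_padding(names, addresses)
for name, address in pairs:
    print(name, "|", address)
```

Note that padding only aligns items by position. If a listing in the middle of the page has no address element, every later row will still be paired with the wrong address, so locating the address relative to each listing is safer when that can happen.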
Upvotes: 1