shongyang low

Reputation: 77

Python > Selenium + CSV: How to scrape from a website and write 2 columns onto CSV?

I need help troubleshooting some code that should do the following:

1) Scrape links from a webpage
2) Scrape text for the links, from the same page

I had some success extracting the links and writing them as a single column:

elements = driver.find_elements_by_xpath("//a[@href]")
with open('csvfile01.csv', "w", newline='') as output:
    writer = csv.writer(output)
    for element in elements:
        writer.writerow([element.get_attribute("href")])

Unfortunately, I got stuck when it came to:
1) getting the "text" for the links,
2) exporting it as a separate column, and
3) scraping only a specific part of the webpage for links, e.g. a table ("td") or a div section

The code as it stands now:

from selenium import webdriver
import time
import csv

driver = webdriver.Chrome()


driver.get("https://en.wikipedia.org/wiki/Main_Page")
time.sleep(5)

columns = ['text', 'link']

e1 = driver.find_element_by_css_selector("a")
e2 = driver.find_elements_by_xpath("//a[@href]")
elements = zip(e1,e2)


time.sleep(5)

with open('csvfile01.csv', "w", newline='') as output:

    writer = csv.writer(output)

    for element in elements:
        writer.writerow(columns)
        writer.writerows(elements)

driver.quit()

Any suggestions would be much appreciated. Thanks!

Upvotes: 1

Views: 125

Answers (1)

Noah

Reputation: 174

As far as getting the text goes, you can use .text on each element. Your CSS selector also doesn't seem right, considering it is only “a”. To get an XPath or CSS selector, inspect the element in the browser, right-click it, click Copy, and you get a list of things to copy. I do not use Selenium much, but when I did, I noticed that for repeated items (like a table of proxies) only one number in the XPath would change from row to row, so I just defined a counter and incremented it in a loop.
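A minimal sketch of the .text idea applied to the two-column CSV, under a few assumptions. The names write_link_rows and scrape_links are made up for illustration and are not from the question. The sketch makes a single find_elements call and reads both the text and the href from each element, instead of zipping two separate lookups (find_element, singular, returns one element, which zip cannot iterate over). It uses the Selenium 4 style find_elements(By.XPATH, ...); on older versions, find_elements_by_xpath from the question plays the same role.

```python
import csv


def write_link_rows(elements, path):
    """Write one (text, link) row per element to a two-column CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as output:
        writer = csv.writer(output)
        # The header is written once, before the loop, not once per element.
        writer.writerow(["text", "link"])
        for element in elements:
            # Both columns come from the same element, so they stay aligned.
            writer.writerow([element.text, element.get_attribute("href")])


def scrape_links(url, path, xpath="//a[@href]"):
    """Open the page in Chrome, collect matching links, and write them to CSV."""
    # Imported here so write_link_rows can be used without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        elements = driver.find_elements(By.XPATH, xpath)
        write_link_rows(elements, path)
    finally:
        # Close the browser even if the scrape fails part-way.
        driver.quit()
```

It could be called as scrape_links("https://en.wikipedia.org/wiki/Main_Page", "csvfile01.csv"). To limit the scrape to one part of the page, a narrower XPath such as "//td//a[@href]" can be passed as the xpath argument; the exact expression depends on the page and is best taken from the browser's inspector as described above.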

Upvotes: 1
