Web scraping with selenium to txt

Question

I would like to scrape the match ids from this page: https://www.flashscore.co.uk/football/russia/premier-league/results/ Then I want to replace the g_1_ prefix of each id with https://www.flashscore.com/match/ and write the resulting URLs to a txt file.

I used this code

matches=WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[starts-with(@id,'g_1_')]")))

for match in matches:
    g1 = matches.replace("g_1_", "https://www.flashscore.com/match/")
    print(g1)

But I got this error

AttributeError: 'list' object has no attribute 'replace'


chitown88 · Accepted Answer

First, as stated in the comments, .replace() is a string method. matches is a list (of WebElements), so calling matches.replace(...) raises AttributeError: 'list' object has no attribute 'replace'. You already iterate over the list with for match in matches:, but inside the loop you still call .replace() on matches rather than on the current element. Inside the loop, grab the id attribute of each match as a string with .get_attribute('id'), and then call .replace() on that string.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


#Initializing the webdriver
options = webdriver.ChromeOptions()

#Uncomment the line below if you'd like to scrape without a new Chrome window every time.
#options.add_argument('headless')

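#Change the path to where chromedriver is on your machine.
#Note: recent Selenium 4 releases no longer accept the driver path as a positional argument;
#there, wrap it in selenium.webdriver.chrome.service.Service(executable_path=...) and pass service=.
driver = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe', options=options)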
driver.maximize_window()

url = 'https://www.flashscore.co.uk/football/russia/premier-league/results/'
driver.get(url)
matches=WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[starts-with(@id,'g_1_')]")))

for match in matches:
    g1 = match.get_attribute('id')
    g1 = g1.replace("g_1_", "https://www.flashscore.com/match/")
    print(g1)
    
driver.close()

You could also combine the two steps inside the loop into a one-liner:

g1 = match.get_attribute('id').replace("g_1_", "https://www.flashscore.com/match/")

Output:

https://www.flashscore.com/match/hWhb9Uyh
https://www.flashscore.com/match/rLoB6SLA
https://www.flashscore.com/match/zer38lib
https://www.flashscore.com/match/Eos77864
https://www.flashscore.com/match/4zzK46jN
https://www.flashscore.com/match/tdkfAAMo
https://www.flashscore.com/match/MBpF5nyH
https://www.flashscore.com/match/IwvO3Q5T
https://www.flashscore.com/match/nysS6yGg
https://www.flashscore.com/match/f1pz5Fp6
https://www.flashscore.com/match/jTwq3gFI
https://www.flashscore.com/match/QLhJ8cos
https://www.flashscore.com/match/0voW5eVa
https://www.flashscore.com/match/Yiqv4ZaC
https://www.flashscore.com/match/4CiN7H0m
https://www.flashscore.com/match/Sh1CoRqo
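
The code above only prints the URLs, while the question also asks for them to end up in a txt file. Below is a minimal sketch of that last step, not part of the original answer. The helper names id_to_url and save_urls and the file name match_urls.txt are made up for illustration. It works on plain id strings, so with Selenium you would first collect them with [m.get_attribute('id') for m in matches].

```python
from pathlib import Path

# Prefixes taken from the question; adjust them if the page markup changes.
ID_PREFIX = "g_1_"
MATCH_URL_PREFIX = "https://www.flashscore.com/match/"


def id_to_url(element_id: str) -> str:
    """Turn an element id such as 'g_1_hWhb9Uyh' into a match URL."""
    # Swap only a leading prefix, so a 'g_1_' later in the id is left alone.
    if element_id.startswith(ID_PREFIX):
        return MATCH_URL_PREFIX + element_id[len(ID_PREFIX):]
    # Ids without the expected prefix are returned unchanged.
    return element_id


def save_urls(element_ids, out_path="match_urls.txt"):
    """Write one URL per line to out_path (hypothetical file name) and return the URLs."""
    urls = [id_to_url(element_id) for element_id in element_ids]
    # Each URL gets its own trailing newline; an empty list gives an empty file.
    Path(out_path).write_text("".join(url + "\n" for url in urls), encoding="utf-8")
    return urls


# Usage with the Selenium code above (assumes `matches` is the list of WebElements):
# save_urls([m.get_attribute('id') for m in matches])
```

Collecting the ids first and writing the file in one go keeps the file handling separate from the browser session, so the driver can be shut down before anything is written.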
