Reputation: 15
How to extract Persian text with Selenium in Python?
The name variable should contain text, but it seems to be empty.
from selenium import webdriver
from selenium.webdriver.common.by import By
website="https://www.digikala.com/product/dkp-3628808/%D8%B1%D9%88%D8%BA%D9%86-%D8%B2%DB%8C%D8%AA%D9%88%D9%86-%D8%A8%DB%8C-%D8%A8%D9%88-%DA%A9%D8%B1%DB%8C%D8%B3%D8%AA%D8%A7%D9%84-%D8%B7%D9%84%D8%A7%DB%8C%DB%8C-3000-%D9%85%DB%8C%D9%84%DB%8C-%D9%84%DB%8C%D8%AA%D8%B1/"
driver = webdriver.Chrome(executable_path=r"C:\Users\Qazal\anaconda3\Lib\site-packages\selenium\chromedriver.exe")
driver.get(website)
name=driver.find_element(By.XPATH,'//div[@class="mr-4"]//a[@href="https://www.digikala.com/seller/AEVNX/"]')
name=name.text
name=name.encode("utf-8")
print(name)
Upvotes: 0
Views: 108
Reputation: 16187
The XPath expression //*[@class="mr-4"]/div/a/p matches the seller names. The script below prints the output shown after it:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
webdriver_service = Service("./chromedriver") #Your chromedriver path
driver = webdriver.Chrome(service=webdriver_service, options=chrome_options)
url='https://www.digikala.com/product/dkp-3628808/%D8%B1%D9%88%D8%BA%D9%86-%20%20%D8%B2%DB%8C%D8%AA%D9%88%D9%86-%D8%A8%DB%8C-%D8%A8%D9%88-%20%20%DA%A9%D8%B1%DB%8C%D8%B3%D8%AA%D8%A7%D9%84-%D8%B7%D9%84%D8%A7%DB%8C%DB%8C-3000-%20%20%D9%85%DB%8C%D9%84%DB%8C-%D9%84%DB%8C%D8%AA%D8%B1/'
driver.get(url)
driver.maximize_window()
time.sleep(5)
for name in driver.find_elements(By.XPATH,'//*[@class="mr-4"]/div/a/p'):
    print(name.text)
Output:
سراج احسان
رزاقی
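
A side note on the question's code: calling name.encode("utf-8") before print converts the str to bytes, so Python prints escaped byte values rather than Persian letters. The snippet below is a minimal sketch of that difference. It uses the first seller name from the output above as a stand-in for element.text (an assumption made for illustration; no browser is involved):

```python
# Stand-in for element.text; Selenium's WebElement.text is a regular Python str.
name = "سراج احسان"

encoded = name.encode("utf-8")  # bytes object, no longer text
print(encoded)                  # prints a bytes literal with \x.. escapes
print(name)                     # prints the Persian text itself

# Decoding the bytes restores the original string.
decoded = encoded.decode("utf-8")
```

Printing the str directly, as the loop above does with name.text, is enough on a terminal that supports UTF-8.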
Upvotes: 2