alyferryhalo

Reputation: 43

How to get the <p> tag content?

I would like to get the text of every <p> tag on each web page, so I wrote this code:

from selenium import webdriver

driver = webdriver.Firefox()

href_list = []
href_p_dict = {}

for i in range(1, 11):
    get_link = f"https://rifey.ru/news?page={i}"
    driver.get(get_link)
    e_list = driver.find_elements_by_class_name('block-link')
    for e in e_list:
        href_list.append(e.get_attribute('href'))

for href in href_list:
    driver.get(href)
    content = driver.find_elements_by_tag_name('p')
    href_p_dict.update({href: content})
    print(href, content)

But my output is like this:

https://rifey.ru/news/list/id_102034 [<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="480831d9-04ed-4443-99de-3a7f16ff3c9c")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="c1e11246-d799-4cfd-bdfb-f1e85f3eaeab")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="4cf97837-fa83-466c-a07d-11b9121b314e")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="89807888-63b3-478d-9af8-92d3155d2197")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="0a5b148a-07cb-46eb-bd63-93fa0a2c7339")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="e36ad0f0-7b5d-4781-9a34-b5c4c198fa97")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="080de5d0-dbab-4059-afcd-120b039dc4b8")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="3a1c5678-be15-4205-97e4-7cfcb8267717")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="400ee932-f79f-40a0-acc0-e0e93832cbb5")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="c32ed757-2646-48c0-9542-ae6c26da20ca")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="1602487b-8bab-42ad-84fd-cf9c4a60506e")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement (session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="1a745694-5e46-4676-9aed-7623e758697a")>, <selenium.webdriver.firefox.webelement.FirefoxWebElement 
(session="75246d26-78b9-4f3a-bb8e-0d61b466ed95", element="5f80d5a6-85af-48b4-9dab-b7d8442c826c")>]

I expect to get the text inside the <p>. I have tried to change my code like this:

content = driver.find_elements_by_tag_name('p').text

and this:

href_p_dict.update({href: content.text})

But either way I get the same kind of traceback:

Traceback (most recent call last):
  File "/home/alyferryhalo/Documents/code/work/rifey_parser.py", line 17, in <module>
    content = driver.find_elements_by_tag_name('p').text
AttributeError: 'list' object has no attribute 'text'

How can I fix it?

I use these:

Upvotes: 1

Views: 106

Answers (1)

cruisepandey

Reputation: 29362

You cannot do

content = driver.find_elements_by_tag_name('p').text

because find_elements_by_tag_name (plural) returns a Python list of WebElement objects, not a single element.

A list has no text attribute; only the individual WebElement objects inside it do. So the error you are seeing is expected:

AttributeError: 'list' object has no attribute 'text'

To resolve this, iterate over the list and read .text from each element:

for con in driver.find_elements_by_tag_name('p'):
    print(con.text)
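
If you want to store the text in your dictionary rather than print it, something like the sketch below may help. The helper names paragraph_texts and collect_paragraphs are made up for illustration, and skipping blank paragraphs is my own assumption about what you want. It uses the same find_elements_by_tag_name call as your code; on recent Selenium 4 releases that method has been removed and the equivalent is driver.find_elements(By.TAG_NAME, 'p').

```python
def paragraph_texts(elements):
    """Return the visible text of each element, skipping blank ones."""
    return [el.text for el in elements if el.text.strip()]


def collect_paragraphs(driver, hrefs):
    """Map each URL to the list of <p> texts found on that page."""
    result = {}
    for href in hrefs:
        driver.get(href)
        # find_elements_* returns a list, so .text is read per element
        result[href] = paragraph_texts(driver.find_elements_by_tag_name('p'))
    return result
```

With the driver and href_list from your question, href_p_dict = collect_paragraphs(driver, href_list) should then give you a dict of URL to list of strings instead of WebElement objects.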

Upvotes: 1
