GCIreland

Reputation: 155

How to get text from find_all in Beautiful Soup 4

I am trying to extract text from the result of find_all in Beautiful Soup 4, but I can't work out how to do it. I have read the docs but still don't understand what they say about this. Here is a minimal reproducible example of my problem; the code currently does not work:

import requests
from bs4 import BeautifulSoup
page = requests.get("https://weather.com/en-IE/weather/tenday/l/e98742cdb581b2f4461e4f438badbfb0e16dc9e70ffbf4c8df1b0f7a4394f9f9")
soup = BeautifulSoup(page.content,"lxml")
info = soup.find_all("span", class_ = "DetailsSummary--extendedData--365A_")

print(info.get_text())



This is most likely a very simple question, but I don't know the answer, so any help would be appreciated.

Upvotes: 0

Views: 2309

Answers (3)

Ram

Reputation: 4779

.find_all() returns a list (a ResultSet of tags), so you need to iterate over that list and print the text of each item using .get_text().

This way:

for item in info:
    print(item.get_text())
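
To see why the original call fails, here is a small offline sketch. The HTML snippet and the class name "summary" are made up for illustration (they are not the real weather.com markup), and the built-in "html.parser" is used instead of "lxml" so that it runs without extra dependencies or a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<div>
  <span class="summary">Cloudy</span>
  <span class="summary">AM Showers</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
info = soup.find_all("span", class_="summary")

# find_all() returns a ResultSet (a list of Tag objects), so calling
# get_text() on the ResultSet itself raises AttributeError.
try:
    info.get_text()
    failed = False
except AttributeError:
    failed = True

# Calling get_text() on each Tag in the list works.
texts = [item.get_text() for item in info]
print(failed, texts)
```

The same pattern applies to the real page: loop over the result of find_all and call get_text() on each element.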

Or get the text directly using a list comprehension:

import requests
from bs4 import BeautifulSoup
page = requests.get("https://weather.com/en-IE/weather/tenday/l/e98742cdb581b2f4461e4f438badbfb0e16dc9e70ffbf4c8df1b0f7a4394f9f9")
soup = BeautifulSoup(page.content,"lxml")

info = [item.get_text() for item in soup.find_all("span", class_ = "DetailsSummary--extendedData--365A_")]
day = [d.get_text() for d in soup.find_all("h2", class_ = "DetailsSummary--daypartName--2FBp2")]

This gives you:

>>> info
['Cloudy', 'Cloudy', 'AM Showers', 'AM Showers', 'Cloudy', 'AM Showers', 'Mostly Cloudy', 'Showers', 'Mostly Cloudy', 'Partly Cloudy', 'PM Showers', 'Showers', 'PM Showers', 'Showers', 'Showers']
>>> day
['Today', 'Sat 13', 'Sun 14', 'Mon 15', 'Tue 16', 'Wed 17', 'Thu 18', 'Fri 19', 'Sat 20', 'Sun 21', 'Mon 22', 'Tue 23', 'Wed 24', 'Thu 25', 'Fri 26']
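
If you want each day next to its summary, one possible approach is to pair the two lists with zip. This is only a sketch: it assumes both lists have the same length and order, which holds for the output above but is not guaranteed if the page layout changes. The values below are the first three entries copied from that output.

```python
# First three entries from the output above.
day = ["Today", "Sat 13", "Sun 14"]
info = ["Cloudy", "Cloudy", "AM Showers"]

# zip() pairs items by position; it stops at the shorter list,
# so a length mismatch would silently drop entries.
forecast = dict(zip(day, info))

for d, summary in forecast.items():
    print(f"{d}: {summary}")
```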

Upvotes: 1

Federico Baù

Reputation: 7675

You need to iterate over the find_all result (see the docs):

import requests
from bs4 import BeautifulSoup

url = "https://weather.com/en-IE/weather/tenday/l/e98742cdb581b2f4461e4f438badbfb0e16dc9e70ffbf4c8df1b0f7a4394f9f9"
page = requests.get(url)
soup = BeautifulSoup(page.content, "lxml")

info = soup.find_all("span", class_="DetailsSummary--extendedData--365A_")
day = soup.find_all("h2", class_="DetailsSummary--daypartName--2FBp2")

# find_all returns a ResultSet (a list of tags, empty if nothing matches),
# so you need to loop over it
info_text = [i.get_text() for i in info]
day_text = [i.get_text() for i in day]

print(info_text)
print(day_text)

To print the result more readably, use pprint:

from pprint import pprint

pprint(info_text, indent=4)
pprint(day_text, indent=4)

Upvotes: 1

Alex V

Reputation: 128

There's just one issue here: info will be a list that you need to iterate over. Using a for loop, you can do:

from bs4 import BeautifulSoup
import requests
page = requests.get("https://weather.com/en-IE/weather/tenday/l/e98742cdb581b2f4461e4f438badbfb0e16dc9e70ffbf4c8df1b0f7a4394f9f9")
soup = BeautifulSoup(page.content,"lxml")
info = soup.find_all("span", class_ = "DetailsSummary--extendedData--365A_")
day = soup.find_all("h2", class_ = "DetailsSummary--daypartName--2FBp2")
for item in info:
    print(item.get_text())
print()
print()
print(info)
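
As a side note, if you only need the first match, find returns a single tag (or None when nothing matches), so get_text can be called on it directly. A minimal offline sketch, using made-up HTML and a hypothetical class name "summary" rather than the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html = '<p><span class="summary"> Cloudy </span><span class="summary">Showers</span></p>'
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag, or None if there is no match.
first = soup.find("span", class_="summary")
first_text = first.get_text(strip=True) if first is not None else None

missing = soup.find("span", class_="no-such-class")
print(first_text, missing)
```

Checking for None before calling get_text avoids an AttributeError when the class name on the page changes.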

Upvotes: 1
