Reputation: 155
I am trying to extract the text from the result of find_all in Beautiful Soup 4, but I don't know how to do it. My current code, shown below, does not work. I have read the docs but could not find anything there that I understood well enough to solve this. Here is a minimal reproducible example of my problem:
import requests
from bs4 import BeautifulSoup
page = requests.get("https://weather.com/en-IE/weather/tenday/l/e98742cdb581b2f4461e4f438badbfb0e16dc9e70ffbf4c8df1b0f7a4394f9f9")
soup = BeautifulSoup(page.content,"lxml")
info = soup.find_all("span", class_ = "DetailsSummary--extendedData--365A_")
print(info.get_text())
I hope someone can help, because this is most likely a very simple question that I just don't know the answer to.
Upvotes: 0
Views: 2309
Reputation: 4779
.find_all() returns a list (a ResultSet), so you need to iterate over that list and print the text of each element using .get_text(), like this:
for item in info:
    print(item.get_text())
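To make the difference concrete, here is a minimal sketch that runs without a network request. The HTML snippet and the class name "summary" are made up for illustration and are not taken from the weather page; "html.parser" is used only so the example does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real page; the class name "summary" is hypothetical.
html = """
<div>
  <span class="summary">Cloudy</span>
  <span class="summary">AM Showers</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

matches = soup.find_all("span", class_="summary")  # every match, as a list-like ResultSet
first = soup.find("span", class_="summary")        # only the first match, as a single Tag

# The ResultSet itself has no get_text(); each Tag inside it does.
try:
    matches.get_text()
    failed = False
except AttributeError:
    failed = True

texts = [tag.get_text() for tag in matches]

print(failed)
print(texts)
print(first.get_text())
```

If only the first match is needed, find() avoids the loop entirely; otherwise iterate over the find_all() result as shown.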
Alternatively, collect the text directly with a list comprehension:
import requests
from bs4 import BeautifulSoup
page = requests.get("https://weather.com/en-IE/weather/tenday/l/e98742cdb581b2f4461e4f438badbfb0e16dc9e70ffbf4c8df1b0f7a4394f9f9")
soup = BeautifulSoup(page.content,"lxml")
info = [item.get_text() for item in soup.find_all("span", class_ = "DetailsSummary--extendedData--365A_")]
day = [d.get_text() for d in soup.find_all("h2", class_ = "DetailsSummary--daypartName--2FBp2")]
This gives you:
>>> info
['Cloudy', 'Cloudy', 'AM Showers', 'AM Showers', 'Cloudy', 'AM Showers', 'Mostly Cloudy', 'Showers', 'Mostly Cloudy', 'Partly Cloudy', 'PM Showers', 'Showers', 'PM Showers', 'Showers', 'Showers']
>>> day
['Today', 'Sat 13', 'Sun 14', 'Mon 15', 'Tue 16', 'Wed 17', 'Thu 18', 'Fri 19', 'Sat 20', 'Sun 21', 'Mon 22', 'Tue 23', 'Wed 24', 'Thu 25', 'Fri 26']
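Since day and info come out as two parallel lists, one possible next step is to pair them with zip. The following is only a sketch under stated assumptions: the inline HTML and the class names "day" and "cond" are invented stand-ins for the real page, and it assumes both lists have the same length and order.

```python
from bs4 import BeautifulSoup

# Invented markup; "day" and "cond" are hypothetical class names, not the site's.
html = """
<section>
  <h2 class="day">Today</h2><span class="cond">Cloudy</span>
  <h2 class="day">Sat 13</h2><span class="cond">AM Showers</span>
</section>
"""

soup = BeautifulSoup(html, "html.parser")

# strip=True trims surrounding whitespace from each extracted string.
day = [d.get_text(strip=True) for d in soup.find_all("h2", class_="day")]
info = [s.get_text(strip=True) for s in soup.find_all("span", class_="cond")]

# Pair each day with its summary; zip stops at the shorter list.
forecast = list(zip(day, info))

for name, cond in forecast:
    print(f"{name}: {cond}")
```

Note that zip silently drops unmatched items if one list is longer, so it is worth comparing len(day) and len(info) first when scraping a real page.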
Upvotes: 1
Reputation: 7675
You need to iterate over the result of find_all, as described in the Beautiful Soup docs:
import requests
from bs4 import BeautifulSoup
url = "https://weather.com/en-IE/weather/tenday/l/e98742cdb581b2f4461e4f438badbfb0e16dc9e70ffbf4c8df1b0f7a4394f9f9"
page = requests.get(url)
soup = BeautifulSoup(page.content, "lxml")
info = soup.find_all("span", class_="DetailsSummary--extendedData--365A_")
day = soup.find_all("h2", class_="DetailsSummary--daypartName--2FBp2")
# find_all returns a ResultSet (a list subclass; empty if nothing matches),
# so loop over it to get the text of each element
info_text = [i.get_text() for i in info]
day_text = [i.get_text() for i in day]
print(info_text)
print(day_text)
For more readable output, pretty-print the result:
from pprint import pprint
pprint(info_text, indent=4)
pprint(day_text, indent=4)
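One caveat, offered as an assumption rather than a known fact about this site: class names ending in a suffix like 365A_ look auto-generated and might change between deployments, in which case find_all simply returns an empty list and the comprehension yields nothing. A hedged sketch of matching on the stable-looking prefix with a compiled regular expression, using made-up inline HTML (the suffix XXXX is a placeholder):

```python
import re

from bs4 import BeautifulSoup

# Made-up markup; the "XXXX" suffix is a placeholder for whatever the site generates.
html = (
    '<span class="DetailsSummary--extendedData--XXXX">Cloudy</span>'
    '<span class="other">skip</span>'
)

soup = BeautifulSoup(html, "html.parser")

# class_ accepts a compiled pattern; it is tested against each CSS class of a tag.
prefix = re.compile(r"^DetailsSummary--extendedData")
info = soup.find_all("span", class_=prefix)
info_text = [i.get_text() for i in info]

# A selector that matches nothing gives an empty result, not an error.
missing = soup.find_all("span", class_="no-such-class")

if not missing:
    print("no elements matched 'no-such-class'")
print(info_text)
```

Checking for an empty result before printing makes it easier to tell "the page changed" apart from "the page has no data".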
Upvotes: 1
Reputation: 128
There's just one issue here: info will be a list that you need to iterate over. Using a for loop, you can do:
from bs4 import BeautifulSoup
import requests
page = requests.get("https://weather.com/en-IE/weather/tenday/l/e98742cdb581b2f4461e4f438badbfb0e16dc9e70ffbf4c8df1b0f7a4394f9f9")
soup = BeautifulSoup(page.content,"lxml")
info = soup.find_all("span", class_ = "DetailsSummary--extendedData--365A_")
day = soup.find_all("h2", class_ = "DetailsSummary--daypartName--2FBp2")
for item in info:
    print(item.get_text())
print()
print()
print(info)
Upvotes: 1