Switchback

Reputation: 95

Why can't I extract text with .text in BeautifulSoup?

When I try to extract the text of all the 'th' tags, I get this error: ResultSet object has no attribute 'text', etc. How do I get the text inside a 'th' tag? Each 'th' tag also contains an 'a' tag (whose text is what I need), but if I type 'country.a' I get a similar error: ResultSet object has no attribute 'a', etc.

code:

from urllib.request import Request
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = Request('https://en.wikipedia.org/wiki/Template:COVID-19_pandemic_data', headers={'User-Agent': 'Mozilla/5.0'})
# Opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
# html parser
page_soup = soup(page_html, "html.parser")
# grabs table data
containers = page_soup.find("table", {"id": "thetable"})
# grabs country names
country = containers.find_all("th", {"scope":"row"}).text

Upvotes: 1

Views: 32

Answers (1)

Jack Fleeting

Reputation: 24940

If all you need is the table on that page, you don't really need that code; just use pandas:

import pandas as pd
table = pd.read_html('https://en.wikipedia.org/wiki/Template:COVID-19_pandemic_data')
table[0]

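pd.read_html returns a list of DataFrames, one per table it finds on the page, and table[0] is the first of them. The output is everything in that table, including notes. From here, use standard pandas methods to extract what you need.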
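As for the error in the question itself: find_all returns a ResultSet, which is a list of tags, so .text and .a have to be read from each element rather than from the list. Below is a minimal sketch of that loop. SAMPLE_HTML and the country_names helper are hypothetical stand-ins (not the real Wikipedia markup) so the example runs without a network connection; only the "thetable" id and the scope="row" filter come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the Wikipedia table, so the example runs offline.
SAMPLE_HTML = """
<table id="thetable">
  <tr><th scope="row"><a href="/wiki/A">Country A</a></th><td>1</td></tr>
  <tr><th scope="row"><a href="/wiki/B">Country B</a></th><td>2</td></tr>
  <tr><th scope="row">No link here</th><td>3</td></tr>
</table>
"""


def country_names(html):
    """Return the link text of every <th scope="row"> cell in the table."""
    page_soup = BeautifulSoup(html, "html.parser")
    table = page_soup.find("table", {"id": "thetable"})
    names = []
    # find_all returns a ResultSet (a list of Tag objects), so iterate over it
    for th in table.find_all("th", {"scope": "row"}):
        link = th.a  # first <a> inside this <th>, or None if there is none
        if link is not None:
            names.append(link.get_text(strip=True))
    return names


names = country_names(SAMPLE_HTML)
print(names)
```

The None check skips header cells that have no link; on the real page you may prefer th.get_text(strip=True) for those rows instead of dropping them.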

Upvotes: 2
