(Python)- How to store text extracted from HTML table using BeautifulSoup in a structured python list

Question

I parse a webpage using beautifulsoup:

import requests
from bs4 import BeautifulSoup 
page = requests.get("webpage url")
soup = BeautifulSoup(page.content, 'html.parser')

I find the table and print the text:

Ear_yield= soup.find(text="Earnings Yield").parent
print(Ear_yield.parent.text)

This gives me the output of a single row of the table:

Earnings Yield
0.01
-0.59
-0.33
-1.23
-0.11

I would like to store this output in a list so that I can write it to an xls file and operate on the elements (for example, check whether Earnings Yield[0] > Earnings Yield[1]). So I write:

import html2text
text1 = Ear_yield.parent.text
Ear_yield_text = html2text.html2text(text1)

list_Ear_yield = []
for i in Ear_yield_text:
    list_Ear_yield.append(i)

Thinking that my web data has gone into the list, I print the fourth item to check:

print(list_Ear_yield[3])

I expect the output to be -0.33, but I get a single character instead.

That means the list holds individual characters rather than whole values. Please let me know what I am doing wrong.

Zroq · Accepted Answer

That is because your Ear_yield_text is a string rather than a list, and iterating over a string yields one character at a time. Assuming the values in the text are separated by newlines, you can split it directly:

list_Ear_yield = Ear_yield_text.split('\n')

Now if you print list_Ear_yield, you will get this result:

['Earnings Yield', '0.01', '-0.59', '-0.33', '-1.23', '-0.11']
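
Since the goal is to compare the numbers, the split strings still need converting to floats. Here is a minimal sketch of that step, assuming the row text looks like the output shown in the question. The names raw_text, cells, label and values are made up for illustration, and raw_text is a hard-coded stand-in for Ear_yield.parent.text; the real string may contain extra blank lines or spaces, which is why empty lines are filtered out.

```python
# Stand-in for Ear_yield.parent.text (assumed shape, copied from the question's output).
raw_text = "Earnings Yield\n0.01\n-0.59\n-0.33\n-1.23\n-0.11\n"

# Looping over a string yields single characters, which is what the
# original for-loop collected.
chars = list(raw_text)
print(chars[3])  # n

# splitlines() breaks the text on newlines; strip() and the filter
# drop surrounding whitespace and empty lines.
cells = [line.strip() for line in raw_text.splitlines() if line.strip()]

# The first cell is the row label, the rest are the numeric values.
label = cells[0]
values = [float(v) for v in cells[1:]]

print(label)                  # Earnings Yield
print(values[2])              # -0.33
print(values[0] > values[1])  # True
```

Keeping the label apart from the numbers means values[0] > values[1] compares floats rather than strings. Note that the value -0.33 sits at index 2 of values, because the label is no longer in the list.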
