Python - BeautifulSoup - Extracting Table Data with tags stuck

Question

In Python, I am trying to extract a table from an HTML file and store its cell values in lists so that I can later compare them and detect changes in the table data. Using mechanize, I was able to automate the download of the HTML page, which sits behind an ID/password login. The second part, placing the data into lists, produces the output below, with the tags still in place. So while I appear to have solved storing the data, I am unsure how to remove the tags before appending the data to the lists.

Link to the HTML document that I am trying to pull data from: https://www.dropbox.com/s/b684ecl7b2l3m10/guildwar.html?dl=0

Sample output (top part). The code follows, starting at the from bs4 line:

[None, None, None,  1 ,  2 ,        3 ]




from bs4 import BeautifulSoup

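# Name the parser explicitly so the result does not depend on which parsers are installed
with open("guildwar.html") as f:
    soup = BeautifulSoup(f, "html.parser")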

rank_0 = []
color_1 = []
name_2 = []
land_3 = []
fortress_4 = []
power_5 = []


for el in soup.findAll('tr'):
    rank = el.find('td', {'class':'t1'})
    rank_0.append(rank)
    color = el.find('td', {'class':'t2'})
    color_1.append(color)
    name = el.find('td', {'class':'t3'})
    name_2.append(name)
    land = el.find('td', {'class':'t4'})
    land_3.append(land)
    fortress = el.find('td', {'class':'t5'})
    fortress_4.append(fortress)
    power = el.find('td', {'class':'t6'})
    power_5.append(power)

print("Ranking")
print(rank_0)
print("\nMagic Color")
print(color_1)
print("\nMage Name")
print(name_2)
print("\nLand")
print(land_3)
print("\nFortress")
print(fortress_4)
print("\nPower")
print(power_5)
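
A minimal sketch of one way to drop the tags, not taken from the original page: call get_text() on each cell before appending it, so the lists hold plain strings instead of Tag objects. The inline HTML is a hypothetical stand-in for guildwar.html (its markup, names, and values are assumptions); only the class names t1 to t6 come from the code above. Rows that lack any of the six cells, such as a header row, are skipped, which would probably also avoid None entries like those in the sample output.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for guildwar.html; the real page's markup is assumed.
html = """
<table>
  <tr><th>Rank</th><th>Color</th><th>Name</th><th>Land</th><th>Fortress</th><th>Power</th></tr>
  <tr><td class="t1"> 1 </td><td class="t2">Red</td><td class="t3">Merlin</td>
      <td class="t4">1200</td><td class="t5">15</td><td class="t6">98000</td></tr>
  <tr><td class="t1"> 2 </td><td class="t2">Blue</td><td class="t3">Morgana</td>
      <td class="t4">1100</td><td class="t5">12</td><td class="t6">91000</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Class names used by the question's code, one per table column.
columns = ["t1", "t2", "t3", "t4", "t5", "t6"]

rows = []
for tr in soup.find_all("tr"):
    cells = [tr.find("td", class_=name) for name in columns]
    # find() returns None when a row has no matching cell (e.g. a header row).
    if any(cell is None for cell in cells):
        continue
    # get_text(strip=True) returns the cell's text without tags or padding.
    rows.append([cell.get_text(strip=True) for cell in cells])

# Per-column lists, mirroring the names used in the question.
rank_0 = [row[0] for row in rows]
name_2 = [row[2] for row in rows]

print(rank_0)
print(name_2)
```

Keeping each table row as one list of strings makes a later comparison between two downloads a plain list comparison; the per-column lists can still be derived from it when needed.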
