Python, using BeautifulSoup parsing values from a table

Question

I am parsing a table in a saved .html document. The rows in the HTML look like:


15:00:19  11.750  5392  ↑
14:56:55  11.750  17    ↑
14:56:52  11.750  479   ↑
14:56:49  11.740  6     ↓
14:56:46  11.740  333   ↓
14:56:43  11.740  21    ↓
14:56:40  11.740  15    ↓
14:56:37  11.740  35    ↓
14:56:34  11.750  11    ↑
14:56:31  11.740  3     ↓
14:56:28  11.740  24    ↓
14:56:22  11.750  291   ↑
14:56:19  11.740  198   ↑
14:56:16  11.730  15    ↓

What I have so far is:

list_a = soup.find_all('table')[0].tbody.find_all("tr")

for a in list_a:
    for b in a:
        for c in b:
            for d in c:
                for e in d:
                    print e.renderContents()

Even though it doesn't look very nice, it does print the values.

However, the table has too many rows. I only want the first 10 groups of data, and from each group only the 3rd and 4th items, collected into 2 lists,

i.e.

["5392", "17", "479", …]

and

["↑", "↑", "↑", …]  # the "↑" can be replaced by some other consistent marker if it's a problem

How can I achieve that? Thanks.

Martin Evans · Accepted Answer

The following will extract your two columns using the span tag inside the li elements:

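from bs4 import BeautifulSoup

# Only the <li>/<span> nesting is known; the wrapper tags are reconstructed.
html = """
<table>
<tbody>
<tr>
    <td>
    <ul>
    <li><span>15:00:19</span><span>11.750</span><span>5392</span><span>?</span></li>
    <li><span>14:56:55</span><span>11.750</span><span>17</span><span>?</span></li>
    <li><span>14:56:52</span><span>11.750</span><span>479</span><span>?</span></li>
    <li><span>14:56:49</span><span>11.740</span><span>6</span><span>?</span></li>
    <li><span>14:56:46</span><span>11.740</span><span>333</span><span>?</span></li>
    <li><span>14:56:43</span><span>11.740</span><span>21</span><span>?</span></li>
    <li><span>14:56:40</span><span>11.740</span><span>15</span><span>?</span></li>
    <li><span>14:56:37</span><span>11.740</span><span>35</span><span>?</span></li>
    <li><span>14:56:34</span><span>11.750</span><span>11</span><span>?</span></li>
    <li><span>14:56:31</span><span>11.740</span><span>3</span><span>?</span></li>
    <li><span>14:56:28</span><span>11.740</span><span>24</span><span>?</span></li>
    <li><span>14:56:22</span><span>11.750</span><span>291</span><span>?</span></li>
    <li><span>14:56:19</span><span>11.740</span><span>198</span><span>?</span></li>
    <li><span>14:56:16</span><span>11.730</span><span>15</span><span>?</span></li>
    </ul>
    </td>
</tr>
</tbody>
</table>
"""

# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, "html.parser")

col_3 = []
col_4 = []

# Each <li> is one row; its <span> children are the four fields
for li in soup.find_all('table')[0].find_all("li"):
    cols = li.find_all("span")
    col_3.append(cols[2].text)
    col_4.append(cols[3].text)

print(col_3)
print(col_4)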

This would give you the following output:

[u'5392', u'17', u'479', u'6', u'333', u'21', u'15', u'35', u'11', u'3', u'24', u'291', u'198', u'15']
[u'?', u'?', u'?', u'?', u'?', u'?', u'?', u'?', u'?', u'?', u'?', u'?', u'?', u'?']
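
The question also asked for only the first 10 groups, which the loop above does not restrict. A minimal sketch of one way to do that, assuming the same li/span layout: `find_all` accepts a `limit` argument that stops the search after that many matches. The `rows` list, the wrapper tags around the `<li>` elements, and the `UP`/`DOWN` names are assumptions made only to keep the example self-contained, and the code is written for Python 3.

```python
from bs4 import BeautifulSoup

UP, DOWN = "\u2191", "\u2193"  # up and down arrows, written as escapes

# Sample rows taken from the question (first 12 of 14), used to rebuild
# markup with the assumed table > ... > li > span nesting.
rows = [
    ("15:00:19", "11.750", "5392", UP),
    ("14:56:55", "11.750", "17", UP),
    ("14:56:52", "11.750", "479", UP),
    ("14:56:49", "11.740", "6", DOWN),
    ("14:56:46", "11.740", "333", DOWN),
    ("14:56:43", "11.740", "21", DOWN),
    ("14:56:40", "11.740", "15", DOWN),
    ("14:56:37", "11.740", "35", DOWN),
    ("14:56:34", "11.750", "11", UP),
    ("14:56:31", "11.740", "3", DOWN),
    ("14:56:28", "11.740", "24", DOWN),
    ("14:56:22", "11.750", "291", UP),
]
items = "".join(
    "<li>" + "".join("<span>{}</span>".format(field) for field in row) + "</li>"
    for row in rows
)
html = "<table><tbody><tr><td><ul>{}</ul></td></tr></tbody></table>".format(items)

soup = BeautifulSoup(html, "html.parser")

col_3 = []
col_4 = []

# limit=10 stops the search after the first ten <li> elements
for li in soup.find("table").find_all("li", limit=10):
    spans = li.find_all("span")
    col_3.append(spans[2].get_text())
    col_4.append(spans[3].get_text())

print(col_3)
print(col_4)
```

Both lists then hold ten entries each. Slicing the result of an unrestricted search, `find_all("li")[:10]`, should behave the same way here; `limit` simply avoids collecting rows that are thrown away.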
