Yebach

Reputation: 1691

Python Beautiful Soup: extracting data from a table

I am parsing an HTML document using Beautiful Soup 4.

Here is an example of a table in the document:

<tr>
<td class="nob"></td>
<td class="">Time of price</td>
<td class=" pullElement pullData-DE000BWB14W0.teFull">08/06/2012</td>
<td class=" pullElement pullData-DE000BWB14W0.PriceTimeFull">11:43:08&nbsp;</td>
<td class="nob"></td>
</tr>
<tr>
<td class="nob"></td>
<td class="">Daily volume (units)</td>
<td colspan="2" class=" pullElement pullData-DE000BWB14W0.EWXlume">0</td>
<td class="nob"></td>
</tr>

I would like to extract the cell values: 08/06/2012 and 11:43:08, "Daily volume (units)", 0, and so on.

This is my code to find the specific table and get all of its data:

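from bs4 import BeautifulSoup

# Parse the file and locate the table by its id
with open("some_file.html") as html:
    soup = BeautifulSoup(html, "html.parser")
t = soup.find(id="ctnt-2308")
# One list of stringified <td> tags per table row
dat = [[str(td) for td in row.find_all("td")] for row in t.find_all("tr")]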

This gives me a list of raw cells that still needs to be organized.

Is there a simple way to do this?

Thank you

Upvotes: 2

Views: 509

Answers (1)

ashish

Reputation: 2180

list(soup.stripped_strings)

will give you all the strings in that soup, with leading and trailing whitespace stripped and whitespace-only strings skipped.
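If you want the values grouped by row rather than as one flat list, the same generator can be applied to each tr. The following is only a minimal sketch: the markup is copied from the question and wrapped in a table whose id is assumed to be "ctnt-2308" (as in the question's code), and the names rows and cells are made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the question; wrapping it in a table
# with id "ctnt-2308" is an assumption based on the question's code.
html = """
<table id="ctnt-2308">
<tr>
<td class="nob"></td>
<td class="">Time of price</td>
<td class=" pullElement pullData-DE000BWB14W0.teFull">08/06/2012</td>
<td class=" pullElement pullData-DE000BWB14W0.PriceTimeFull">11:43:08&nbsp;</td>
<td class="nob"></td>
</tr>
<tr>
<td class="nob"></td>
<td class="">Daily volume (units)</td>
<td colspan="2" class=" pullElement pullData-DE000BWB14W0.EWXlume">0</td>
<td class="nob"></td>
</tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find(id="ctnt-2308")

# Map each row's label (first non-empty cell) to the remaining cell values.
rows = {}
for tr in table.find_all("tr"):
    cells = list(tr.stripped_strings)  # empty cells yield no strings
    if cells:
        rows[cells[0]] = cells[1:]

print(rows)
```

Here rows["Time of price"] holds the date and time, and rows["Daily volume (units)"] holds the volume. Keying on the label text assumes labels are unique within the table; if they are not, collecting the per-row lists in a plain list would be safer.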

Upvotes: 1
