Howli

Reputation: 12469

Replace text in python with BeautifulSoup

I am trying to parse a table with BeautifulSoup and replace the blank cells found in some rows with -, so instead of

<tr>
<td><small>15</small></td>
<td><small><small>Cat</small></small></td>
</tr>
<tr>
<td><small><small>   </small></small></td>
<td><small><small> </small></small></td>
</tr>

I want

<tr>
<td><small>15</small></td>
<td><small><small>Cat</small></small></td>
</tr>
<tr>
<td><small><small>-</small></small></td>
<td><small><small>-</small></small></td>
</tr>

I have kind of managed to do this with:

from bs4 import BeautifulSoup

soup = BeautifulSoup(open("table.html"))

for a in soup.findAll('small'):
    a.replaceWith("-")

That does replace the blank cells, but it also removes the text 15 and Cat (I know that what I have replaces every <small> tag outright). That is as far as I have been able to get. How can I fix the code so that it only replaces the blank space with -?

EDIT: Sorry, here is the raw code:

<tr>
<td><small>15</small></td >
<td><small><small>&nbsp;</small></small></td >
</tr>
<tr>
<td><small><small>&nbsp; &nbsp;</small></small></td >
<td><small><small>&nbsp;</small></small></td >
</tr>

Upvotes: 2

Views: 11749

Answers (1)

mortymacs

Reputation: 3736

Try this:

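from bs4 import BeautifulSoup

with open("table.html") as f:
    soup = BeautifulSoup(f, "html.parser")
for tag in soup.find_all("small"):
    # Skip wrapper tags so the nested <small> structure is preserved.
    if tag.find("small") is not None:
        continue
    # bs4 decodes &nbsp; to "\xa0", which strip() removes like any whitespace.
    if not tag.get_text().strip():
        tag.string = "-"
print(soup)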

You need to check the value before replacing it.
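
The same check-before-replacing idea can also be applied to the text nodes themselves rather than to the tags. The following is only a sketch, assuming Python 3 and the bs4 package with the built-in "html.parser"; the function name fill_blank_small and the placeholder parameter are made up for illustration and are not part of BeautifulSoup. It relies on bs4 decoding &nbsp; to the non-breaking space character "\xa0", which str.strip() treats as whitespace.

```python
from bs4 import BeautifulSoup


def fill_blank_small(markup, placeholder="-"):
    """Replace whitespace-only text directly inside <small> tags.

    Hypothetical helper for illustration; returns the modified soup.
    """
    soup = BeautifulSoup(markup, "html.parser")
    # find_all returns a list, so replacing nodes while looping is safe.
    for node in soup.find_all(string=True):
        parent = node.parent
        inside_small = parent is not None and parent.name == "small"
        # Only text nodes are replaced, so the surrounding tags stay intact.
        if inside_small and not node.strip():
            node.replace_with(placeholder)
    return soup


sample = (
    "<td><small>15</small></td>"
    "<td><small><small>&nbsp; &nbsp;</small></small></td>"
)
print(fill_blank_small(sample))
```

Because only the text node is swapped out, the nested <small> tags and the cells that already contain text such as 15 are left alone. A <small></small> tag with no text node at all would not be touched by this version.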

Upvotes: 5
