Reputation: 6891
I'm trying to build an HTML table that contains only the table header and the row that is relevant to me. The site I'm using is http://wolk.vlan77.be/~gerben.
I want to extract the table header and my own table entry so that I do not have to search for my name each time.
What I want to do :
What I am doing now :
pass this array to a method that generates a string that can be printed as html page
def downloadURL(self):
    global input
    filehandle = self.urllib.urlopen('http://wolk.vlan77.be/~gerben')
    input = ''
    for line in filehandle.readlines():
        input += line
    filehandle.close()

def soupParserToTable(self,input):
    global header
    soup = self.BeautifulSoup(input)
    header = soup.first('tr')
    tableInput='0'
    table = soup.findAll('tr')
    for line in table:
        print line
        print '\n \n'
        if '''lucas''' in line:
            print 'true'
        else:
            print 'false'
        print '\n \n **************** \n \n'
I want to get the row from the HTML file that contains lucas. However, when I run the code like this, I get the following in my output:
****************
<tr><td>lucas.vlan77.be</td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span style="color:green;font-weight:bold">V</span></td> </tr>
false
Now I don't get why it doesn't match; the string lucas is clearly in there :/
Upvotes: 0
Views: 2339
Reputation: 141878
It looks like you're over-complicating this.
Here's a simpler version...
>>> import BeautifulSoup
>>> import urllib2
>>> html = urllib2.urlopen('http://wolk.vlan77.be/~gerben')
>>> soup = BeautifulSoup.BeautifulSoup(html)
>>> print soup.find('td', text=lambda data: data.string and 'lucas' in data.string)
lucas.vlan77.be
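If the goal is the header row plus the matching row, as the question describes, the same idea can be extended roughly as follows. This is only a sketch under a few assumptions: it uses Python 3 and the bs4 package rather than the Python 2 / BeautifulSoup 3 setup used in this thread, it assumes the first tr is the header, and both SAMPLE_HTML (including its column names) and the helper filter_table are made up for illustration instead of being taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; only the host name
# lucas.vlan77.be comes from the question, the rest is invented.
SAMPLE_HTML = """
<table>
<tr><th>host</th><th>check 1</th></tr>
<tr><td>other.vlan77.be</td><td>V</td></tr>
<tr><td>lucas.vlan77.be</td><td>V</td></tr>
</table>
"""

def filter_table(html, name):
    """Return a table with the first row plus every later row whose text mentions name."""
    rows = BeautifulSoup(html, "html.parser").find_all("tr")
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]  # assumption: the first row is the header
    kept = [header] + [row for row in body if name in row.get_text()]
    return "<table>\n" + "\n".join(str(row) for row in kept) + "\n</table>"

result = filter_table(SAMPLE_HTML, "lucas")
print(result)
```

Matching on row.get_text() rather than on the row object itself keeps the comparison a plain substring test on strings.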
Upvotes: 3
Reputation: 2393
It's because line is not a string but a BeautifulSoup.Tag instance, so the in test checks the tag's child nodes rather than searching its text. Try checking the td value instead:
if '''lucas''' in line.td.string:
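To see the difference in isolation, here is a small sketch that runs both tests on a shortened copy of the row from the question. It assumes Python 3 and the bs4 package (the thread itself uses Python 2 and BeautifulSoup 3), and it searches the text of the whole row via get_text(), which also copes with rows that have no td cell, such as a header row built from th cells.

```python
from bs4 import BeautifulSoup

# Shortened copy of the row printed in the question.
ROW = ('<tr><td>lucas.vlan77.be</td> '
       '<td><span style="color:green;font-weight:bold">V</span></td> </tr>')

row = BeautifulSoup(ROW, "html.parser").tr

# Membership test against the tag's direct children: no child equals "lucas".
in_tag = "lucas" in row

# Substring test against the text contained in the row.
in_text = "lucas" in row.get_text()

print(in_tag, in_text)
```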
Upvotes: 1