Alex Zel

Reputation: 688

BeautifulSoup: unable to compare strings

I'm trying to write a web spider to gather some links and text. I'm working with a table in which the second cell of each row contains a number. All I want to do is read that number and, if it's one I need, grab the links and text in cells 2 and 4.

Everything works fine except that I can't compare the number from the cell to a list of numbers I have.

I get the number using cells[1].get_text() (I create a list of all the cells for each row). This works fine and type() returns <class 'str'>. I also make sure to convert my list of numbers to strings.

But when I try to compare them, the result is always False.

import bs4

file = open(r"some html file", 'rb')
# The numbers of interest, converted to strings for comparison with the cell text
rng_lst = [str(x) for x in range(5, 43)]

soup = bs4.BeautifulSoup(file)

table = soup.findAll('table')[0]
for row in table.findAll('tr'):
    cells = row.findAll('td')
    if len(cells) >= 6:
        # Text of the second cell, exactly as get_text() returns it
        check = cells[1].get_text()
        for n in rng_lst:
            if n == check:
                # do stuff
                pass

I've tried everything I can think of and I ALWAYS get False. Neither == nor 'is' works. Using 'in' does work, but then if I need the cell with number 5 I also get 15 or 25.

Upvotes: 0

Views: 1002

Answers (1)

alecxe

Reputation: 473863

Most likely, you just need to strip the text you are getting from a cell:

check = cells[1].get_text(strip=True)

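It is still a guess, but an "educated" one: get_text() returns the cell's text together with any surrounding whitespace and newlines, so a string like "5" never equals the raw cell text, while a substring test with 'in' still matches.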
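To see why stripping would help, here is a minimal sketch in plain Python. The value of raw is a made-up example of what a cell's text might look like if the markup has newlines or indentation around the number. It is an assumption, not taken from the actual HTML file in the question.

```python
# Hypothetical cell text: the number surrounded by whitespace, as it might
# come back from get_text() when the <td> contains newlines or indentation.
raw = "\n    5\n  "

# The wanted numbers as strings; a set makes the membership test direct.
wanted = {str(x) for x in range(5, 43)}

# repr() makes the hidden whitespace visible when debugging.
print(repr(raw))

exact_before = raw in wanted          # compares the unstripped text
exact_after = raw.strip() in wanted   # compares the stripped text

# A substring test matches "5" inside the text of a cell holding 15.
substring_hit = "5" in "\n    15\n  "

print(exact_before, exact_after, substring_hit)  # False True True
```

Printing repr() of the cell text is a quick way to confirm or rule out this guess. If the output shows extra whitespace, either get_text(strip=True) or calling .strip() on the result makes the exact comparison work, and 15 or 25 no longer match when you are looking for 5.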

Upvotes: 2
