MultiPurposeEraser

Reputation: 237

How can I get data from a specific cell in an HTML table in Python?

This link contains the table I'm trying to parse. I'm trying to use BeautifulSoup in Python. I'm very new to BeautifulSoup and HTML. This is my attempt to solve my problem.

from bs4 import BeautifulSoup

# Parse the local copy of the page; naming the parser explicitly avoids a bs4 warning
with open('BBS_student_grads.php') as f:
    soup = BeautifulSoup(f, 'html.parser')

data = []
table = soup.find('table')
rows = table.find_all('tr')  # list of rows in the table

for row in rows[1:]:           # skip the first row
    cols = row.find_all('td')  # all cells in this row
    data.append(cols)          # one inner list per row, giving a 2D list called data

print(data[0][0])  # prints the top-left cell


I'm trying to extract all the names in the table, update them in a list, and then write the updated names back into the table. I'm working from a local copy of the HTML as a temporary fix until I learn more web programming.

Help is much appreciated.

Upvotes: 1

Views: 2411

Answers (1)

alecxe

Reputation: 473853

I think you need just the td elements in the tr element with class="searchbox_black".

You can use CSS Selectors to get to the desired td elements:

for cell in soup.select('tr.searchbox_black td'):
    print(cell.text)

It prints:

BB Salsa

 Adams State University Alamosa, CO               
              Sensei: Oneyda Maestas               
              Raymond Breitstein               

...
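If you want to index a specific cell and later write changed names back, something along these lines might work. This is only a sketch: the inline `html` string is a made-up stand-in for the real page (its actual row and cell layout may differ), and `data`, `first_cell` and `updated_html` are names chosen for illustration. It collapses the whitespace inside each cell, builds one list per matching row, then replaces one cell's text and serialises the tree back to HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the local BBS_student_grads.php copy;
# the real page's structure is an assumption here.
html = """
<table>
  <tr><th>Dojo</th></tr>
  <tr class="searchbox_black">
    <td>BB Salsa</td>
    <td>
      Adams State University Alamosa, CO
      Sensei: Oneyda Maestas
      Raymond Breitstein
    </td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# One inner list per matching row; each cell's text has its
# runs of whitespace collapsed to single spaces.
data = [
    [" ".join(td.get_text().split()) for td in row.find_all("td")]
    for row in soup.select("tr.searchbox_black")
]

print(data[0][0])  # first cell of the first matching row

# Replace a cell's text in place, then turn the tree back into HTML.
first_cell = soup.select_one("tr.searchbox_black td")
first_cell.string = "BB Salsa (updated)"
updated_html = str(soup)
```

Writing `updated_html` back to a file would then give you the modified table. Note that assigning to `.string` replaces everything inside the cell, so it is safest for cells that hold plain text only.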

Upvotes: 1
