Reputation: 347
I am trying to scrape one specific row from the page at base_url (circled in blue in the first picture). The page source is shown in the second picture.
My goal is to get the <td> tags in that row, but my code does not return them.
My code:
from bs4 import BeautifulSoup
from selenium import webdriver
import requests, csv, re, pandas, numpy
base_url = "http://www.basket.fi/sarjat/ottelu/?game_id=3502579&season_id=93783&league_id=4+"+"#mbt:2-400$t&0=1"
browser = webdriver.PhantomJS()
browser.get(base_url)
table = BeautifulSoup(browser.page_source, 'lxml')
for data in table.find_all("tr", {"class": "row2"}):
    print(data.find("td").text)
Upvotes: 2
Views: 1217
Reputation: 15376
Usually you can select HTML elements by attribute, but in this document the 'class' attribute is not very helpful, because many other 'tr' tags share the same class.
In that case you can use a list index to pick the row you want from the find_all result and then read its cells, skipping the first 'td':
for td in table.find_all("tr", {"class": "row2"})[25].find_all('td')[1:]:
    print(td.get_text(strip=True))
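If you are not sure which index holds the row you need, it can help to print each row's position first. The snippet below is only a minimal sketch: the HTML string is made-up stand-in markup (not the real page), row_values is a hypothetical helper, and it uses the built-in "html.parser" so it runs without a browser or lxml.

```python
from bs4 import BeautifulSoup

# Made-up stand-in markup (an assumption); the real page has many more "row2" rows.
html = """
<table>
  <tr class="row2"><td>Team A</td><td>10</td><td>20</td></tr>
  <tr class="row2"><td>Team B</td><td>15</td><td>25</td></tr>
  <tr class="row2"><td>Total</td><td>25</td><td>45</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find_all("tr", {"class": "row2"})


def row_values(rows, index):
    """Hypothetical helper: text of every cell after the first in rows[index]."""
    if not 0 <= index < len(rows):
        # Out-of-range index: return an empty list instead of raising IndexError.
        return []
    return [td.get_text(strip=True) for td in rows[index].find_all("td")[1:]]


# List each row's position next to its first cell to find the index you need.
for i, row in enumerate(rows):
    first = row.find("td")
    print(i, first.get_text(strip=True) if first else None)

values = row_values(rows, 2)
print(values)
```

On the real page you would pass browser.page_source instead of the html string and use the index you found (25 in the answer above). Keep in mind that a hard-coded index breaks if the site adds or removes rows, so the bounds check is worth keeping.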
Upvotes: 2