Reputation: 33
How do I scrape the starting pitchers and import them into Excel?
Here is my code so far.
from urllib.request import urlopen
from lxml import html
response = urlopen("https://www.baseball-reference.com/previews/index.shtml")
content = response.read()
tree = html.fromstring(content)
Upvotes: 0
Views: 94
Reputation: 55002
I'll get you started. First, use cssselect unless you prefer XPath (lxml's .cssselect() method needs the cssselect package to be installed):
import cssselect
Then work out the CSS selectors for the elements you want to iterate over:
for div in tree.cssselect('.game_summaries'):
    for a in div.cssselect('table:nth-child(2) a'):
        print(a.text)
You can find the CSS selectors with your browser's element inspector (Chrome is best).
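For the Excel part, one simple route is to write the names to a CSV file, which Excel opens directly. This is only a rough sketch: the helper name pitchers_to_csv, the "pitcher" column header, and the placeholder names are my own assumptions, not anything from the page. In practice you would pass in the a.text values collected in the loop above.

```python
import csv
import io


def pitchers_to_csv(names, out):
    """Write one name per row, as CSV, to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["pitcher"])  # header row; the column name is an assumption
    for name in names:
        if name:  # a.text is None when a link has no direct text
            writer.writerow([name.strip()])


# Placeholder values standing in for the a.text results from the loop above.
buf = io.StringIO()
pitchers_to_csv(["Pitcher A", None, " Pitcher B "], buf)
csv_text = buf.getvalue()
print(csv_text)
```

To get an actual file, pass a real file handle instead of the StringIO buffer, for example open("pitchers.csv", "w", newline="", encoding="utf-8-sig"). The newline="" argument is what the csv module expects, and the utf-8-sig encoding helps Excel detect UTF-8 if any names contain accented characters.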
Upvotes: 2