How do I scrape the starting pitchers and import them into Excel?

Question

Here is my code so far.

from urllib.request import urlopen
from lxml import html

response = urlopen("https://www.baseball-reference.com/previews/index.shtml")
content = response.read()
tree = html.fromstring(content)

pguardiario · Accepted Answer

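I'll get you started. First, use cssselect unless you prefer XPath. lxml's .cssselect() method needs the cssselect package installed (pip install cssselect); the explicit import is optional, but it fails early if the package is missing: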

import cssselect

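Then work out the CSS selectors for the elements you want to iterate over:

# For each .game_summaries block, print the text of every link inside
# a table that is the second child of its parent.
for div in tree.cssselect('.game_summaries'):
    for a in div.cssselect('table:nth-child(2) a'):
        print(a.text)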

You can find the CSS selectors with your browser's element inspector (Chrome's is best).
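For the Excel half of the question, one possible approach is to collect the names into a list instead of printing them, then write a CSV file, which Excel opens directly. This is only a sketch: the selectors are the ones from the loop above and may not match the page's current markup, and the helper names collect_names and write_csv, the "pitcher" column header, and the output filename are made up for illustration.

```python
import csv


def collect_names(tree):
    """Gather link text using the same selectors as the loop above."""
    names = []
    for div in tree.cssselect('.game_summaries'):
        for a in div.cssselect('table:nth-child(2) a'):
            # a.text is None for links that contain no direct text
            if a.text:
                names.append(a.text.strip())
    return names


def write_csv(names, path):
    """Write one name per row under a single (hypothetical) header."""
    # utf-8-sig adds a BOM so Excel recognises the file as UTF-8;
    # newline='' lets the csv module control line endings.
    with open(path, 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.writer(f)
        writer.writerow(['pitcher'])
        writer.writerows([name] for name in names)
```

With the tree from the question, that would be called as write_csv(collect_names(tree), 'pitchers.csv'). Check the printed output first, since the selector may also pick up links that are not pitchers.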
