Python: Extract text from website that is not in the raw HTML

Question

I am scraping data from webpages and need to store that data (a bunch of strings) in a txt file. I already have code that does this for many websites, but I have hit a roadblock on a page where BeautifulSoup does not seem to work.

Take this website for example: http://www.vucommodores.com/gametracker/launch/gt_mbasebl.html?event=1530990&school=vand&sport=mbasebl&camefrom=&startschool=&

I want to click the play-by-play button and then extract the text for the 1st inning, 2nd inning, and so on. Unlike all of my other examples, the text is not available in the raw HTML. Is anyone aware of a way to do this?

Thanks!

Lgiro · Accepted Answer

I don't think this is what BeautifulSoup is meant for: it only parses the HTML it is given and cannot run scripts or respond to clicks. You can use Selenium for Python to interact with the page as if from a browser and simulate the click, then extract the text from the rendered HTML.
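
Here is a rough sketch of that approach, not a tested solution for this particular site. It assumes Selenium 4 with Chrome installed, and it assumes the play-by-play button is a link whose visible text contains "Play-by-Play"; check the real page in your browser's inspector and adjust the locator. The function names (clean_lines, fetch_text_after_click, save_play_by_play) are made up for illustration.

```python
def clean_lines(text):
    """Split rendered page text into stripped, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fetch_text_after_click(url, link_text, timeout=10):
    """Open url in a real browser, click a link, return the visible text."""
    # Selenium is imported here so clean_lines() stays usable without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # ASSUMPTION: Chrome is installed locally
    try:
        driver.get(url)
        wait = WebDriverWait(driver, timeout)
        before = driver.find_element(By.TAG_NAME, "body").text

        # ASSUMPTION: the button is a link whose text contains link_text.
        locator = (By.PARTIAL_LINK_TEXT, link_text)
        wait.until(EC.element_to_be_clickable(locator)).click()

        # Wait until the visible text changes, i.e. new content was rendered.
        wait.until(lambda d: d.find_element(By.TAG_NAME, "body").text != before)
        return driver.find_element(By.TAG_NAME, "body").text
    finally:
        driver.quit()  # always close the browser, even on errors


def save_play_by_play(url, path, link_text="Play-by-Play"):
    """Fetch the text shown after the click and write it to a txt file."""
    lines = clean_lines(fetch_text_after_click(url, link_text))
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))
```

You would call save_play_by_play(url, "play_by_play.txt") with the gametracker URL from the question. If you would rather keep your existing BeautifulSoup code, return driver.page_source instead of the body text and parse that, since it reflects the page after the click. Grabbing the whole body is the bluntest option; once you know which element holds each inning, a narrower locator will give cleaner output.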
