Python - How to scrape paginated pages without pagination in URL

Question

Here is a sample page:

https://www.ncbi.nlm.nih.gov/pubmed/?term=hg38

It has 40 results. How can I get to the next page through the URL, with something like:

https://www.ncbi.nlm.nih.gov/pubmed/?term=hg38**?page=2**

I know how to use scraping libraries (BS4, Selenium), but I don't know how to scrape sites like this one. I've been playing with the Google Chrome dev tools, without success.

I know PubMed has an API, but the API doesn't return the info that I need (whether an article is freely downloadable or not). What's the usual workflow for scraping sites like this in Python?

9716278 · Accepted Answer

Page numbers are not part of this site's URL scheme. Look at the Python Selenium driver instead: with Selenium you can load the page and have your program click buttons on it to change the displayed content. That way you can get to page two of the results and then continue scraping the newly displayed HTML.
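As a rough illustration of that click-and-scrape loop, here is a minimal sketch. The helper name `collect_pages`, the `a.next` selector, and the fixed pause are all assumptions made for the example; the real "next" control has to be found by inspecting the page in the browser's dev tools, and an explicit wait is usually more robust than sleeping.

```python
import time


def collect_pages(driver, next_selector, max_pages=5, pause=1.0):
    """Return the HTML of each results page, clicking "next" between pages.

    `driver` is a Selenium WebDriver that has already loaded the first page.
    `next_selector` is a CSS selector for the "next page" control
    (hypothetical here; inspect the real page to find the right one).
    """
    pages = [driver.page_source]
    while len(pages) < max_pages:
        # "css selector" is the string value of selenium's By.CSS_SELECTOR.
        # find_elements returns an empty list when nothing matches.
        buttons = driver.find_elements("css selector", next_selector)
        if not buttons or not buttons[0].is_enabled():
            break  # no usable "next" control, so this is the last page
        buttons[0].click()
        time.sleep(pause)  # crude wait for the new content to render
        pages.append(driver.page_source)
    return pages


def main():
    # Imported here so the helper above can be exercised without a browser.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("https://www.ncbi.nlm.nih.gov/pubmed/?term=hg38")
        # "a.next" is a placeholder selector, not taken from the actual site.
        pages = collect_pages(driver, "a.next", max_pages=3)
        print(f"collected {len(pages)} page(s) of HTML")
    finally:
        driver.quit()


if __name__ == "__main__":
    main()
```

Each string in the returned list can then be passed to BeautifulSoup (or parsed any other way) to pull out the fields you need.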

Python3 Selenium Driver

Selenium Documentation
