Retrieve the number of Google Scholar search results by year using R or Python?

Question

I have no idea how to start, so I have no attempted code to show, and I apologize for that. Is there a way to loop over the following URL, substituting a sequence of numbers (years)?

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C22&as_ylo=2021&q=%22TERM1%22+AND+%22TERM2%22&btnG=

Here 2021 would be replaced by each year in the sequence, and I just want the number of search results for each year.

Thank you so much!

Edit:

The following works for a regular Google search but not for Google Scholar, where it returns an empty node set.

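library(RCurl)  # getURL()
library(XML)    # htmlTreeParse(), getNodeSet()

# Send a browser-like User-Agent with the request
ua <- "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36"
url <- "https://scholar.google.com/scholar?hl=en&as_sdt=0%2C22&as_ylo=2021&q=%22causal+inference%22+AND+%22statistics%22&btnG="
doc <- htmlTreeParse(getURL(url, httpheader = list(`User-Agent` = ua)), useInternalNodes = TRUE)

# "result-stats" is the id of the result-count element on a regular Google search page
nodes <- getNodeSet(doc, "//div[@id='result-stats']")
nodes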
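
For the year loop itself, here is a minimal Python sketch, not a tested scraper. It assumes that the as_ylo and as_yhi URL parameters bound the publication year from below and above (so setting both to the same value restricts results to one year, whereas as_ylo=2021 alone means "since 2021"), and that the count appears in the page text as "About N results". The helper names build_url, parse_result_count, and counts_by_year are hypothetical. Google Scholar may answer automated requests with a CAPTCHA page, in which case the sketch returns None for that year.

```python
import re
import time
import urllib.parse
import urllib.request

BASE = "https://scholar.google.com/scholar"
UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36")


def build_url(query, year):
    """Build a Scholar search URL restricted to a single publication year."""
    params = {
        "hl": "en",
        "as_sdt": "0,22",
        "as_ylo": year,  # lower bound on publication year
        "as_yhi": year,  # upper bound on publication year
        "q": query,
    }
    return BASE + "?" + urllib.parse.urlencode(params)


def parse_result_count(html):
    """Extract N from 'About N results' or 'N result(s)'; None if absent."""
    match = re.search(r"(?:About\s+)?(\d[\d,.]*)\s+results?\b", html)
    if match is None:
        return None  # e.g. a CAPTCHA page or a changed page layout
    return int(re.sub(r"\D", "", match.group(1)))


def counts_by_year(query, years, pause=10.0):
    """Fetch one page per year; pause between requests to stay polite."""
    counts = {}
    for year in years:
        request = urllib.request.Request(
            build_url(query, year), headers={"User-Agent": UA}
        )
        with urllib.request.urlopen(request) as response:
            html = response.read().decode("utf-8", errors="replace")
        counts[year] = parse_result_count(html)
        time.sleep(pause)
    return counts


if __name__ == "__main__":
    query = '"causal inference" AND "statistics"'
    for year, count in counts_by_year(query, range(2015, 2022)).items():
        print(year, count)
```

The URL building and the parsing are kept separate from the network call so that each can be checked on its own, for example by calling parse_result_count on a saved copy of a results page.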
