Augusto

Reputation: 1

rvest help: webscraping empty results

I need help with my web scraping. Can someone help me?

I am trying to get the list of universities from this webpage: https://www.whed.net/results_institutions.php. For this purpose, I am using the following code:

library(rvest)
library(dplyr)


whed_afg <- "https://www.whed.net/results_institutions.php"
whed_afg1 <- read_html(whed_afg)
whed_afg1
str(whed_afg1)

univ_afg1 = whed_afg1 %>% html_nodes("#results .fancybox\\.iframe") %>% html_text()
univ_afg1

I used a double backslash (\\) in the html_nodes selector because a single backslash gave me this error: Error: '\.' is an unrecognized escape in character string starting ""#results .fancybox\."

The code now runs, but the result is empty. Can someone help me? I do not know what I am doing wrong.

Thank you all, Ricardo

Upvotes: 0

Views: 80

Answers (1)

QHarr

Reputation: 84465

I think you may have the wrong start URL, or the page is behind a login, since I get redirected when I use your URL. I see a full university list at the following URL, which uses a different class for selecting. The results could then be split by country of interest.

library(rvest)

url <- "https://www.iau-aiu.net/List-of-IAU-Members?lang=en"
universities <- read_html(url) %>% html_nodes('.spip_out') %>% html_text()
print(universities)
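
As a side note on the escaping in the question's selector: the doubled backslash looks right to me. In an R string, "\\." becomes the two characters \. and CSS reads that as a literal dot inside the class name, so the selector targets a class literally named fancybox.iframe. Here is a minimal sketch to check that offline. The HTML snippet and the university names in it are made up for illustration (an assumption, not copied from the WHED page):

```r
library(rvest)

# Hypothetical stand-in for the results page: two links whose class
# attribute contains a class name with a literal dot in it.
snippet <- paste0(
  '<div id="results">',
  '<a class="fancybox fancybox.iframe" href="#">University A</a>',
  '<a class="fancybox fancybox.iframe" href="#">University B</a>',
  '</div>'
)
page <- read_html(snippet)

# "\\." in the R string reaches the CSS parser as "\.", an escaped dot,
# so this selects elements with the class "fancybox.iframe" inside #results.
matched <- page %>%
  html_nodes("#results .fancybox\\.iframe") %>%
  html_text()

print(matched)
```

If this matches on the made-up snippet but the real page still returns character(0), the selector syntax is probably fine and the page that read_html() receives simply does not contain those nodes.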

Upvotes: 1
