Reputation: 1
I need help with my web scraping. Can someone help me?
I am trying to get the list of universities on this webpage: https://www.whed.net/results_institutions.php. For this purpose, I am using the following code:
library(rvest)
library(dplyr)
whed_afg <- "https://www.whed.net/results_institutions.php"
whed_afg1 <- read_html(whed_afg)
whed_afg1
str(whed_afg1)
univ_afg1 = whed_afg1 %>% html_nodes("#results .fancybox\\.iframe") %>% html_text()
univ_afg1
I used a double backslash (\\) in the html_nodes selector because a single backslash gave me this error: Error: '\.' is an unrecognized escape in character string starting ""#results .fancybox\."
Can someone help me? I do not know what I am doing wrong.
Thank you all, Ricardo
Upvotes: 0
Views: 80
Reputation: 84465
I think you may have the wrong start URL, or the page is behind a login, because I get redirected when I request your URL. I can see a full university list at the following URL, which uses a different class for selecting. The results could then be split by country of interest.
library(rvest)
url <- "https://www.iau-aiu.net/List-of-IAU-Members?lang=en"
universities <- read_html(url) %>% html_nodes('.spip_out') %>% html_text()
print(universities)
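On the escaping in the question's selector: the doubled backslash looks correct to me. R turns "\\." into the two characters \. and the CSS engine then reads that as a literal dot inside the class name. The sketch below tries this on a small inline HTML fragment. The fragment and the university names are made up for illustration and are only an assumption about how the real results page is structured.

```r
library(rvest)

# Hypothetical fragment standing in for the real results page (assumed structure)
html <- '<div id="results">
  <a class="fancybox fancybox.iframe" href="#">University A</a>
  <a class="fancybox fancybox.iframe" href="#">University B</a>
</div>'
page <- read_html(html)

# In R source, "\\." is the two characters \. which CSS treats as a literal dot,
# so this matches the class "fancybox.iframe" rather than two chained classes
selector <- "#results .fancybox\\.iframe"
cat(selector, "\n")  # shows the selector as the CSS engine receives it

names_found <- page %>% html_nodes(selector) %>% html_text()
print(names_found)
```

If the same selector returns an empty result on the live page, the likely cause is that the elements are not in the HTML that read_html() receives (for example because of the redirect), not the escaping.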
Upvotes: 1