Reputation: 4092
I have been trying to use this question and this tutorial to get the table and links for the list of available R packages on CRAN.
I got the table with this:
library(rvest)
page <- read_html("http://cran.r-project.org/web/packages/available_packages_by_name.html") %>% html_node("table") %>% html_table(fill = TRUE, header = FALSE)
The trouble starts when I try to get the links. I used SelectorGadget on the first column of the table (the package links) and got the selector td a, so I tried this:
test2 <- read_html("http://cran.r-project.org/web/packages/available_packages_by_name.html") %>% html_node("td a") %>% html_attr("href")
But I only get the first link. Then I thought I could get all the href attributes from the table and tried the following:
test3 <- read_html("http://cran.r-project.org/web/packages/available_packages_by_name.html") %>% html_node("table") %>% html_attr("href")
But I got nothing. What am I doing wrong?
Upvotes: 0
Views: 112
Reputation: 12819
Essentially, an "s" is missing: use html_nodes() instead of html_node(). html_node() returns only the first match, while html_nodes() returns every match:
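library(rvest)
# Read the page once and reuse the parsed document
x <- read_html(paste0(
  "http://cran.r-project.org/web/",
  "packages/available_packages_by_name.html"))
# Select every <a> inside a <td>, then pull the href from each
html_nodes(x, "td a") %>%
  sapply(html_attr, "href")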
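To see the difference without hitting the network, here is a minimal sketch on a small inline table. The HTML snippet, the package names, and the href values are made up for illustration and are not the real CRAN markup. It also shows why the third attempt in the question returned nothing useful: the table element itself has no href attribute.

```r
library(rvest)

# Hypothetical stand-in for the CRAN page: a tiny table with two links.
html <- '<table>
  <tr><td><a href="pkgA/index.html">pkgA</a></td><td>First package</td></tr>
  <tr><td><a href="pkgB/index.html">pkgB</a></td><td>Second package</td></tr>
</table>'
page <- read_html(html)

# html_node() keeps only the first match, so a single href comes back.
first_link <- page %>% html_node("td a") %>% html_attr("href")

# html_nodes() keeps every match; html_attr() works on the whole node set.
all_links <- page %>% html_nodes("td a") %>% html_attr("href")

# The <table> element has no href attribute, so the result is NA.
table_href <- page %>% html_node("table") %>% html_attr("href")
```

Since html_attr() already accepts a set of nodes, the sapply() call above can probably be replaced by a plain html_attr("href"). In rvest 1.0.0 and later the same functions are also available as html_element() and html_elements().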
Upvotes: 1