R: Extracting HTML's from a List

Question

I am working with the R programming language. I have a list that contains HTTP links (amongst other things) and looks something like this:

    library(rvest)
    library(httr)
    library(XML)

    url <- "mywebsite.com"
    page <- read_html(url)
    links1 <- page %>% html_nodes("li")

    head(links1)

    {xml_nodeset (393)}
     [3]
     [6] Home
     [7] L ...
     [9] Local Listings < ...
    [10]

I want to extract every URL contained in this list. I think these are stored in the "href" attributes of the entries. I tried several ways to do this without success, and in the end I found a different approach that works:

    # source: https://www.geeksforgeeks.org/extract-all-the-urls-from-the-webpage-using-r-language/

    # make the HTTP request
    resource <- GET(url)

    # parse the response as HTML
    parse <- htmlParse(resource)

    # collect the href attribute of every <a> tag
    links2 <- xpathSApply(parse, path = "//a", xmlGetAttr, "href")

    # print the links
    print(links2)

My question: I would have thought there is some way to extract the links directly from "links1", instead of approaching the problem with a different method as I did for "links2". Can someone please show me how to extract the URLs from "links1"?
Thanks!
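
One possible way to get the URLs out of "links1" with rvest alone is sketched below. It assumes the links sit in `<a>` tags nested inside the `<li>` nodes, which the question does not confirm. The inline HTML string is a made-up stand-in for the real page so that the example runs without network access.

    library(rvest)

    # Hypothetical stand-in for the real page (not the actual site markup)
    html <- '<ul>
      <li><a href="/home">Home</a></li>
      <li><a href="/listings">Local Listings</a></li>
      <li>No link here</li>
    </ul>'
    page <- read_html(html)

    # Same selection as in the question: every <li> node
    links1 <- page %>% html_nodes("li")

    # The <li> nodes have no href of their own, so look for <a> tags
    # inside them and read the href attribute from those
    urls <- links1 %>% html_nodes("a") %>% html_attr("href")
    print(urls)

Calling `html_attr("href")` directly on "links1" would return only `NA` values here, because the attribute belongs to the nested `<a>` tags and not to the `<li>` nodes. If the page uses relative links, `xml2::url_absolute(urls, url)` may help turn them into full URLs.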
