Reputation: 35
I am trying to import a table from a website by copying the XPath of the table from the page's HTML and scraping it with the rvest package. I have done this successfully many times before, but now I only get an empty list. To diagnose the problem, I ran the following code (taken from https://www.r-bloggers.com/using-rvest-to-scrape-an-html-table/), but it also returns an empty list for me.
Thanks in advance for the help!
library(rvest)
url <- "http://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population"
population <- url %>%
  read_html() %>%
  html_nodes(xpath='//*[@id="mw-content-text"]/table[1]') %>%
  html_table()
Upvotes: 1
Views: 847
Reputation: 206167
Your XPath query is wrong. The table is not a direct child of the node with the id mw-content-text, but it is a descendant. In XPath, a single / selects only direct children, while // selects descendants at any depth. Try
html_nodes(xpath='//*[@id="mw-content-text"]//table[1]')
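To see the difference without relying on the live page, here is a minimal sketch using a small inline HTML string. The markup and the values in it are made up for illustration; the only assumption is that the table sits inside an extra wrapper element under mw-content-text, which is the situation described above.

```r
library(rvest)

# Hypothetical miniature page: the table is nested inside an extra <div>,
# so it is a descendant, not a direct child, of #mw-content-text.
page <- read_html('
<html><body>
  <div id="mw-content-text">
    <div class="wrapper">
      <table>
        <tr><th>State</th><th>Population</th></tr>
        <tr><td>Alpha</td><td>100</td></tr>
        <tr><td>Beta</td><td>200</td></tr>
      </table>
    </div>
  </div>
</body></html>')

# "/" looks at direct children only, so nothing matches.
child_match <- html_nodes(page, xpath = '//*[@id="mw-content-text"]/table[1]')
length(child_match)       # 0

# "//" looks at descendants at any depth, so the nested table matches.
descendant_match <- html_nodes(page, xpath = '//*[@id="mw-content-text"]//table[1]')
length(descendant_match)  # 1

# html_table() returns one data frame per matched table.
population <- html_table(descendant_match)[[1]]
```

Note that //table[1] matches every table that is the first table among its own siblings, so on a page with several nested containers it can return more than one node. If you want only the first table in document order, wrapping the path in parentheses, as in (//*[@id="mw-content-text"]//table)[1], should do that.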
Web scraping is a very fragile endeavor and can easily break when websites change their HTML.
Upvotes: 3