lindre3000

Reputation: 35

rvest returning empty list

I am trying to import a table from a website by copying the table's XPath from the page's HTML and scraping it with the rvest package. I have done this successfully several times before, but now I only get an empty list. To diagnose the problem, I ran the following code (taken from https://www.r-bloggers.com/using-rvest-to-scrape-an-html-table/), but it also produces an empty list for me.

Thanks in advance for the help!

library(rvest)
url <- "http://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population"
population <- url %>%
  read_html() %>%
  html_nodes(xpath='//*[@id="mw-content-text"]/table[1]') %>%
  html_table()

Upvotes: 1

Views: 847

Answers (1)

MrFlick

Reputation: 206167

Your XPath query is wrong. The table is not a direct child of the node with an id of mw-content-text, which is what the single / before table requires. It is a descendant, though, so use // instead. Try

html_nodes(xpath='//*[@id="mw-content-text"]//table[1]') 

Web scraping is a very fragile endeavor and can easily break when websites change their HTML.
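
To see the child versus descendant difference without depending on the live page, here is a minimal sketch that runs both queries against a small inline HTML string. The markup is a made-up stand-in (the wrapper div and the table contents are assumptions, not Wikipedia's actual HTML); only the id mw-content-text and the two XPath expressions come from the question and the answer above.

```r
library(rvest)

# Hypothetical stand-in for the real page: the table is nested inside a
# wrapper div, so it is a descendant of #mw-content-text, not a direct child.
page <- read_html('<div id="mw-content-text">
  <div class="wrapper">
    <table>
      <tr><th>State</th><th>Population</th></tr>
      <tr><td>A</td><td>1</td></tr>
    </table>
  </div>
</div>')

# Single slash: only tables that are direct children of the node match.
child_hits <- html_nodes(page, xpath = '//*[@id="mw-content-text"]/table[1]')

# Double slash: tables at any depth below the node match.
descendant_hits <- html_nodes(page, xpath = '//*[@id="mw-content-text"]//table[1]')

length(child_hits)       # 0, so html_table() on it gives an empty list
length(descendant_hits)  # 1

# html_table() returns a list of tables; take the first one.
population <- html_table(descendant_hits)[[1]]
```

One caveat: //table[1] matches every table that is the first table child of its own parent, so on a page with nested or sibling containers it can return more than one node. If you want only the first table in document order, wrapping the path in parentheses, as in (//*[@id="mw-content-text"]//table)[1], should do that.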

Upvotes: 3
