Reputation: 660
I'm trying to import the table on this website into R, but it isn't working. It's a simple HTML table, so it should be amenable to the readHTMLTable function in the XML package. Please advise.
require(XML)
url = 'https://www.archives.gov/federal-register/electoral-college/allocation.html'
table = readHTMLTable(url,header = T,stringsAsFactors=F)
Upvotes: 0
Views: 508
Reputation: 164
Here is a solution using the rvest package:
library(tidyverse)
library(rvest)
read_html("https://www.archives.gov/federal-register/electoral-college/allocation.html") %>% # read the html page
html_nodes("table") %>% # extract nodes which contain a table
.[5] %>% # select the node which contains the relevant table
html_table(trim = TRUE) # extract the table (TRUE is safer than T, which can be reassigned)
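Picking the table by position (.[5]) depends on the page layout staying the same. As a rough sketch of a sturdier alternative, the parsed tables can be filtered by their column headers instead. The inline HTML below is a made-up stand-in for the real page, and has_state_col is a hypothetical helper name, so this illustrates the idea and not the exact structure of the archives.gov page.

```r
library(rvest)

# Made-up stand-in for the downloaded page: one layout table, one data table.
page <- read_html('
  <html><body>
    <table><tr><td>navigation</td></tr></table>
    <table>
      <tr><th>State</th><th>Number of Electoral Votes</th></tr>
      <tr><td>Alabama</td><td>9</td></tr>
      <tr><td>Alaska</td><td>3</td></tr>
    </table>
  </body></html>')

# Parse every table, then keep only those that have a "State" column.
tables <- html_table(html_nodes(page, "table"))
has_state_col <- function(tbl) "State" %in% names(tbl)  # hypothetical helper
votes <- Filter(has_state_col, tables)[[1]]
```

On the real page the same filter would replace the .[5] step, assuming the header text there is exactly "State".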
Upvotes: 1
Reputation: 50718
You can do the following:
library(XML)
library(RCurl)
# Download the page with RCurl, then parse its HTML tables
URL <- "https://www.archives.gov/federal-register/electoral-college/allocation.html"
lst <- readHTMLTable(getURL(URL))
# Remove NULL elements in lst
lst <- Filter(Negate(is.null), lst)
Upon inspection, we see that the main table is element 4 in lst:
df <- lst[[4]]
df
#   State                 Number of Electoral Votes
#1  Alabama                9
#2  Alaska                 3
#3  Arizona               11
#4  Arkansas               6
#5  California            55
#6  Colorado               9
#7  Connecticut            7
#8  Delaware               3
#9  District of Columbia   3
#10 Florida               29
#11 Georgia               16
#12 Hawaii                 4
#13 Idaho                  4
#14 Illinois              20
#15 Indiana               11
#16 Iowa                   6
#17 Kansas                 6
#18 Kentucky               8
#19 Louisiana              8
#20 Maine                  4
#21 Maryland              10
#22 Massachusetts         11
#23 Michigan              16
#24 Minnesota             10
#25 Mississippi            6
#26 Missouri              10
#27 Montana                3
#28 Nebraska               5
#29 Nevada                 6
#30 New Hampshire          4
#31 New Jersey            14
#32 New Mexico             5
#33 New York              29
#34 North Carolina        15
#35 North Dakota           3
#36 Ohio                  18
#37 Oklahoma               7
#38 Oregon                 7
#39 Pennsylvania          20
#40 Rhode Island           4
#41 South Carolina         9
#42 South Dakota           3
#43 Tennessee             11
#44 Texas                 38
#45 Utah                   6
#46 Vermont                3
#47 Virginia              13
#48 Washington            12
#49 West Virginia          5
#50 Wisconsin             10
#51 Wyoming                3
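Depending on the R version and the stringsAsFactors setting, readHTMLTable may return the vote counts as character strings or factors instead of numbers. Here is a minimal sketch of the conversion, using a three-row stand-in for df (values copied from the output above) so that it runs without downloading anything:

```r
# Stand-in for the first rows of df; the votes are stored as text on purpose.
df <- data.frame(
  State = c("Alabama", "Alaska", "Arizona"),
  Votes = c("9", "3", "11"),
  stringsAsFactors = FALSE
)
names(df)[2] <- "Number of Electoral Votes"

# Going through as.character() keeps the conversion safe for factors too.
df[[2]] <- as.integer(as.character(df[[2]]))
total <- sum(df[[2]])
```

The same two conversion lines should apply unchanged to the full table, assuming its second column holds only whole numbers.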
The reason your approach does not work is that readHTMLTable, when given a URL, fetches the page itself, and the downloader it uses cannot handle https. You therefore need to download the page first with RCurl's getURL() and pass the resulting text to readHTMLTable.
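To illustrate that the parsing step itself is fine once the content is already in memory, here is a small sketch that hands readHTMLTable a parsed HTML string, not a URL. The inline HTML is an invented stand-in for the text that getURL(URL) would return, so no network access is involved.

```r
library(XML)

# Invented stand-in for the text that getURL(URL) would return.
html <- "<html><body><table>
  <tr><th>State</th><th>Number of Electoral Votes</th></tr>
  <tr><td>Alabama</td><td>9</td></tr>
  <tr><td>Alaska</td><td>3</td></tr>
</table></body></html>"

# Parse the text first (no download involved), then extract the tables.
doc <- htmlParse(html, asText = TRUE)
lst <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)
df <- lst[[1]]
```

Any other way of getting the page text over https should work equally well as input here; RCurl is simply the one used above.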
Upvotes: 0