Reputation: 3515
I want to analyze some earthquake data in R. A typical result page (one of many) contains the following in its HTML source:
<pre>
Year,Month,Day,Time(hhmmss.mm)UTC,Latitude,Longitude,Magnitude,Depth,Catalog
2012, 01, 01, 003008.77, 12.008, 143.487, 5.1, 35, PDE-W
.....
</pre>
I have managed to get the comma-separated data into a character string in which \n separates the rows, but I am not clear how to proceed from there, and I am not sure this is the best approach anyway.
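library(XML)
url <- "http://neic.usgs.gov/cgi-bin/epic/epic.cgi?SEARCHMETHOD=1&FILEFORMAT=6&SEARCHRANGE=HH&SYEAR=2012&SMONTH=01&SDAY=01&EYEAR=2012&EMONTH=1&EDAY=31&LMAG=4&UMAG=&NDEP1=&NDEP2=&IO1=&IO2=&CLAT=0.0&CLON=0.0&CRAD=0.0&SUBMIT=Submit+Search"
# Parse the page (this step was missing from the snippet; htmlParse() is assumed to be how basicInfo was created)
basicInfo <- htmlParse(url)
# Extract the text inside the <pre> element
data <- xpathSApply(basicInfo, "//*/pre/text()", xmlValue)
str(data) #chr "\n Year,Month,Day, .... Catalog\n 2012,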
Any help appreciated
Upvotes: 1
Views: 229
Reputation: 121578
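You can pass the string straight to read.table through its text argument, which parses it as if it were the contents of a file:
data.df <- read.table(text = data, fill = TRUE, sep = ",", header = TRUE)
and you get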
head(data.df)
Year Month Day Time.hhmmss.mm.UTC Latitude Longitude Magnitude Depth Catalog
1 2012 1 1 3008.77 12.008 143.487 5.1 35 PDE-W
2 2012 1 1 4342.77 12.014 143.536 4.4 35 PDE-W
3 2012 1 1 5008.04 -11.366 166.218 5.3 67 PDE-W
4 2012 1 1 12207.66 -6.747 130.007 4.2 145 PDE-W
5 2012 1 1 23521.11 23.472 91.834 4.6 27 PDE-W
6 2012 1 1 24036.40 6.677 -73.110 4.0 158 PDE-W
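One thing to watch in the output above: the time column is read as a number, so 003008.77 becomes 3008.77 and the leading zeros of the hhmmss.mm format are lost. If you need the time of day, a possible workaround is to read that column as character. The following is only a sketch: it uses a small inline string in place of the downloaded data, and the names raw, quakes, tm and when are made up for illustration.

```r
# Stand-in for the scraped string. The first data row is from the question;
# the second is reconstructed from the answer's output (assumed raw form).
raw <- paste(
  "",
  " Year,Month,Day,Time(hhmmss.mm)UTC,Latitude,Longitude,Magnitude,Depth,Catalog",
  " 2012, 01, 01, 003008.77, 12.008, 143.487, 5.1, 35, PDE-W",
  " 2012, 01, 01, 004342.77, 12.014, 143.536, 4.4, 35, PDE-W",
  sep = "\n"
)

# Read column 4 (the time) as character; NA lets read.table guess the rest.
# strip.white = TRUE drops the space that follows each comma.
quakes <- read.table(
  text = raw, sep = ",", header = TRUE, fill = TRUE,
  strip.white = TRUE,
  colClasses = c(NA, NA, NA, "character", NA, NA, NA, NA, NA)
)

# Combine date and time into a UTC timestamp.
tm <- quakes[[4]]
quakes$when <- as.POSIXct(
  sprintf("%04d-%02d-%02d %s:%s:%s",
          quakes$Year, quakes$Month, quakes$Day,
          substr(tm, 1, 2), substr(tm, 3, 4), substr(tm, 5, 9)),
  format = "%Y-%m-%d %H:%M:%OS", tz = "UTC"
)
```

With the real data you would pass data instead of raw. The positional colClasses vector assumes the nine columns always arrive in the order shown in the question.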
Upvotes: 3