Oniropolo

Reputation: 909

Load incomplete .dat file into R

My data comes from this URL and has the following structure:

 93193KFAT FAT2013123016150015   NP [0000  ] 0.00              39999   29.791        
 93193KFAT FAT2013123016160016   NP [0000  ] 0.00              39999   29.791        
 93193KFAT FAT2013123016170017   NP [0000  ]                   39999   29.791        
 93193KFAT FAT2013123016170017   NP [0000  ] 0.00              39999   29.791 

Two things to note:

  1. the data is separated by blank spaces,
  2. some column entries are missing (e.g. the 0.00 in row 3).

When I load this into R, I get the following error:

 Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :   
 line 377 did not have 12 elements

How can I fix this so that I can read the file directly from the URL without any problems?

Thank you!

 > read.fwf(ftp://ftp.ncdc.noaa.gov/pub/data/asos-onemin/6406-2013/64060KFAT201312.dat)
 Error: unexpected '/' in "read.fwf(ftp:/"
 trying URL 'ftp://ftp.ncdc.noaa.gov/pub/data/asos-onemin/6406-    2013/64060KFAT201312.dat'
 using Synchronous WinInet calls
 Error in download.file(url, downloadPath) : 
 cannot open URL 'ftp://ftp.ncdc.noaa.gov/pub/data/asos-onemin/6406-2013/64060KFAT201312.dat'
 In addition: Warning message:
 In download.file(url, downloadPath) : InternetOpenUrl failed: ''
 Error in download.file(url, downloadPath) : unsupported URL scheme

1) I tried url("..."). I get the error:

 Error in url("ftp.ncdc.noaa.gov/pub/data/asos-onemin/6406-2013/64060KFAT201312.dat") : 
   unsupported URL scheme

2) I tried library(RCurl) and called getURL("..."). I get the error:

 Error in file(file, "rt") : cannot open the connection
 In addition: Warning message:
 In file(file, "rt") :
   cannot open file  [... and R shows the data in the url ]

Upvotes: 0

Views: 314

Answers (1)

cory

Reputation: 6659

Something along the lines of this:

a <- read.fwf("ftp://ftp.ncdc.noaa.gov/pub/data/asos-onemin/6406-2013/64060KFAT201312.dat", 
              widths=c(9, 20, 2, 3, 9, 5, 6, 7))
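
To see why fixed-width reading avoids the "did not have 12 elements" error, here is a rough sketch on two records modeled on the sample in the question. The widths and column names in it are assumptions obtained by counting characters in that short sample, not the official layout of the file, so they differ from the widths above and would need adjusting for the full-length records.

```r
# Two records modeled on the question's sample. The second one has a blank
# precipitation field. Padding is generated with strrep() so that the column
# positions stay explicit.
prefix  <- "93193KFAT FAT2013123016150015   NP [0000  ] "
suffix  <- paste0(strrep(" ", 14), "39999   29.791")
records <- c(paste0(prefix, "0.00", suffix),
             paste0(prefix, "    ", suffix))

# Whitespace splitting sees one field fewer on the record with the blank,
# which is the kind of mismatch that makes scan() complain.
con <- textConnection(records)
n_fields <- count.fields(con)
close(con)

# Fixed-width reading takes each value from its character positions, so a
# blank numeric field simply becomes NA. Negative widths skip filler columns.
# These widths and names are assumptions for this sample only.
con <- textConnection(records)
obs <- read.fwf(con,
                widths = c(9, -1, 19, -3, 2, -1, 8, -1, 4, -14, 5, -3, 6),
                col.names = c("station", "stamp", "type", "flag",
                              "precip", "code", "pressure"))
close(con)

print(n_fields)
print(obs)
```

If reading straight from the FTP URL keeps failing, it may be worth downloading the file to disk first and pointing read.fwf at the local copy.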

Upvotes: 2
