Reputation: 1289
I'm trying to read the following file in R
It works when I paste the url directly in a browser but for some reason when its downloaded via R it cannot be opened.
I have tried download.file() with all its accepted methods ('lib-curl/internal/wininet/...')
u <- 'https://www.ssga.com/us/en/individual/etfs/library-content/products/fund-data/etfs/us/holdings-daily-us-en-xop.xlsx'
download.file(u)
The downloaded file cannot be opened manually nor in R via readxl::read_xlsx() nor gdata::read.xls()
I have also tried the httr package but it seems that content type 'xlsx:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' is not supported. I did learn however that the file seems to be gzipped but im not even sure if thats the problem
response <- httr:Get(u)
response$headers$`content-encoding`
"gzip"
Can someone suggest how I might be able to download this data within R?
Upvotes: 0
Views: 359
Reputation: 173793
You need to save it as a binary using mode = "wb"
in download.file
:
u <- 'https://www.ssga.com/us/en/individual/etfs/library-content/products/fund-data/etfs/us/holdings-daily-us-en-xop.xlsx'
tmp <- tempfile()
download.file(u, tmp, mode = "wb")
readxl::read_xlsx(tmp)
#> # A tibble: 74 x 8
#> `Fund Name:` `SPDR® S&P® Oil & Gas~ X__1 X__2 X__3 X__4 X__5 X__6
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Ticker Symbol: XOP <NA> <NA> <NA> <NA> <NA> <NA>
#> 2 Holdings: As of 17-Jan-2020 <NA> <NA> <NA> <NA> <NA> <NA>
#> 3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
#> 4 Name Ticker Ident~ SEDOL Weight Sector Shar~ Local~
#> 5 Apache Corporat~ APA 03741~ 20439~ 3.668~ Oil & Ga~ 2896~ USD
#> 6 Occidental Petr~ OXY 67459~ 26554~ 2.913~ Integrat~ 1639~ USD
#> 7 WPX Energy Inc. WPX 98212~ B40PC~ 2.784~ Oil & Ga~ 5668~ USD
#> 8 Concho Resource~ CXO 20605~ B1YWR~ 2.742~ Oil & Ga~ 7852~ USD
#> 9 EOG Resources I~ EOG 26875~ 23180~ 2.709~ Oil & Ga~ 8191~ USD
#> 10 Hess Corporation HES 42809~ 20237~ 2.689~ Oil & Ga~ 9922~ USD
#> # ... with 64 more rows
Upvotes: 2