Arturo Sbr
Arturo Sbr

Reputation: 6333

Download excel file from URL and read it with `read_xlsx`

I am trying to download a particularly messy .xlsx file from a URL to a local directory and then read this file using read_xlsx.

# Download file into directory
my_url <- 'https://docs.google.com/spreadsheets/d/0Bw4a10rhk2QqaTZkUmQwaXU4aEE/edit?resourcekey=0-RQa9gRpFX0x3z5bSJGn0Dg#gid=1944035140'
download.file(url=my_url, destfile='./dat/df.xlsx')

# Load file
df <- read_xlsx('./dat/df.xlsx')

This last line throws the following error:

Error: Evaluation error: zip file '/Users/... some path .../dat/df.xlsx' cannot be opened.

I believe this is happening because download.file() is messing up the format somehow. A few other similar issues have been solved, but the solution (mode='wb') did not help.


Could you help me download the file without messing up the format so I can later read this file using read_xlsx?

As an additional request, I would like to use as few external dependencias as possible (that's the reason I tried this with download.file()).

Upvotes: 0

Views: 1812

Answers (1)

Marek Fiołka
Marek Fiołka

Reputation: 4949

Indeed, the link takes you to Google Docs and is for non-downloadable editing. You cannot download this file this way. Just save it to your hard drive. However, I did a function that reads data from a file downloaded to disk. Maybe it will be useful to you.

library(tidyverse)
library(readxl)
urlFile = "https://docs.google.com/spreadsheets/d/1SF0PkBz9BR4yqiQ27Bt5OsD33Y8Rt5lh/edit?usp=sharing&ouid=107152468748636733235&rtpof=true&sd=true"
xlsFile = "refugios_nayarit.xlsx"
download.file(url=urlFile, destfile=xlsFile, mode="wb")

fReadXls = function(xlsFile, sheet) {
  data = read_excel(
  xlsFile, sheet = sheet, skip = 6,
  col_names = c("No.", "REFUGIO", "MUNICIPIO", "DIRECCIÓN", "USO DEL INMUEBLE",
                "SERVICIOS", "CAPACIDAD DE PERSONAS", "COORD. LATITUD L", 
                "COORD. LATITUD W", "COORD. ALTITUD MSNM", "RESPONSABLE", 
                "TELÉFONO"))
  data %>% slice_head(n=nrow(.)-1)
}

df = tibble(sheet = excel_sheets(xlsFile)) %>% 
  mutate(data = map(sheet, ~fReadXls(xlsFile, .x)))

df$data[[1]]

output

# A tibble: 20 x 12
     No. REFUGIO       MUNICIPIO  DIRECCIÓN     `USO DEL INMUEB~ SERVICIOS     `CAPACIDAD DE PE~ `COORD. LATITUD~ `COORD. LATITUD~
   <dbl> <chr>         <chr>      <chr>         <chr>            <chr>                     <dbl> <chr>            <chr>           
 1     1 PRIMARIA LAB~ ACAPONETA  LOPEZ RAYON   EDUCACION        AGUA, SANITA~               200 "22°29'56.06\""  "105°21'37.27\""
 2     2 JARDIN DE NI~ ACAPONETA  ALDAMA ESQ C~ EDUCACION        AGUA, SANITA~               100 "22°29'53.14\""  "105°21'29.48\""
 3     3 PRIMARIA CAR~ ACAPONETA  E. CARRANZA   EDUCACION        AGUA, SANITA~               200 "22o30'00.43\""  "105°21'37.46\""
 4     4 PRIMARIA LAZ~ ACAPONETA  AMADO NERVO   EDUCACION        AGUA, SANITA~               100 "22°29'27.17\""  "105°21'39.68"  
 5     5 PRIMARIA H. ~ ACAPONETA  VERACRUZ No.~ EDUCACION        AGUA, SANITA~               150 "22o29'40.21\""  "105a21'40.23\""
 6     6 PRIMARIA MIG~ ACAPONETA  MORELOS Y OA~ EDUCACION        AGUA, SANITA~               200 "22o29'23.26\""  "105a21'41.99\""
 7     7 PRIMARIA CEN~ ACAPONETA  MATAMOROS Y ~ EDUCACION        AGUA, SANITA~               250 "22o29'37.31\""  "105a21'33.33\""
 8     8 SINDICATO CTM ACAPONETA  QUERETARO Y ~ GREMIO SINDICAL  AGUA, SANITA~               100 "22°29'39.32\""  "105°21'46.60"  
 9     9 ESTADIO MUNI~ ACAPONETA  JUAN ESCUTIA  DEPORTE          AGUA, SANITA~               300 "22a29'55.10\""  "105a21'52.29\""
10    10 CASA DE LA C~ ACAPONETA  MORELOS       CULTURAL         AGUA, SANITA~               300 "22a29'20.78\""  "105a21'46.46\""
11    11 CENTRO RECRE~ ACAPONETA  México ENTRE~ RECREATIVO       AGUA, SANITA~               400 "22a29'39.76\""  "105a21'37.87\""
12    12 CENTRO RECRE~ ACAPONETA  VERACRUZ No15 RECREATIVO       AGUA, SANITA~               400 "22a29'30.03\""  "105a21'39.47\""
13    13 IGLESIA CRIS~ ACAPONETA  VERACRUZ No ~ RELIGIOSO        AGUA, SANITA~                50 "22a29'41.60\""  "105a21'40.35\""
14    14 ESCUELA FRAY~ AHUACATLAN 20 DE NOVIEM~ EDUCACION        AGUA, SANITA~                80 "21a03'06.07\""  "104a29'03.50\""
15    15 SECUNDARIA F~ AHUACATLAN 20 DE NOVIEM~ EDUCACION        AGUA, SANITA~               250 "21a03'18.33\""  "104a28'56.26\""
16    16 ESCUELA JOSE~ AHUACATLAN MORELOS Y MA~ EDUCACION        AGUA, SANITA~               200 "21a03'04.55\""  "104a29'12.67\""
17    17 ESCUELA PREP~ AHUACATLAN CALLE EL SAL~ EDUCACION        AGUA, SANITA~               200 "21a02'57.01\""  "104a29'16.71\""
18    18 ESCUELA PLAN~ AHUACATLAN OAXACA E HID~ EDUCACION        AGUA, SANITA~               200 "21a03'02.43\""  "104a28'58.82\""
19    19 UNIDAD ACADE~ AHUACATLAN CARR A GUADA~ EDUCACION        AGUA, SANITA~               200 "21a03'28.20\""  "104a29'06.67\""
20    20 CLUB SOCIAL ~ AHUACATLAN 20 DE NOVIEM~ DEPORTE          AGUA, SANITA~               400 "21a03'07.37\""  "104a29'01\"57\~
# ... with 3 more variables: COORD. ALTITUD MSNM <dbl>, RESPONSABLE <chr>, TELÉFONO <chr>

Upvotes: 1

Related Questions