Juan_Got
Juan_Got

Reputation: 13

R: How can I read a browser-based csv?

I am trying to import data in R using an API which is in a CSV format. The url is https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01288PM/csv

What I tried is:

url <- "https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01288PM/csv"
data <- read.csv(url)

But this does not yield the intended results. The dataframe picks up the all values as a header. I think the problem is that url does not download an actual csv file. Instead, the API generates a browser based table in CSV format. This can be verified by trying the url.

Any thoughts on how to import this correctly? Thanks in advance.

Upvotes: 1

Views: 113

Answers (2)

Nicol&#225;s Velasquez
Nicol&#225;s Velasquez

Reputation: 5898

Juán, welcome to StackOverflow.

If you want to read it as CSV, go with r2evans answer. Yet, Perú's Central Bank's API is quite robust with other formats. I would suggest their HTML tables saved as XLS (i.e. "MS Excel"), and JSON formats.

html table with XLS file type:

library(rvest)
xls_url <- "https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01288PM/xls"

table_xls <- html_table(read_html(xls_url))
table_xls
           IPC Sin Alimentos
1  Mar.2019          127.1504
2  Abr.2019          127.2806
3  May.2019          127.4804
4  Jun.2019          127.5056
...

json with nested table

library(jsonlite)
json_url <- "https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01288PM/json"

json_object <- fromJSON(txt = json_url)
table_json <- json_object$periods

table_json
       name           values
1  Mar.2019 127.150411989837
2  Abr.2019 127.280597562152
3  May.2019 127.480400572476
...

Upvotes: 1

r2evans
r2evans

Reputation: 160447

Read it in as text, split on <br>, then parse as CSV:

txt <- strsplit(readLines("https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PN01288PM/csv"), "<br>")[[1]]
txt[1:2]
# [1] "Mes/A&ntilde;o,\"&Iacute;ndice de precios Lima Metropolitana (&iacute;ndice 2009 = 100) - IPC Sin Alimentos\""
# [2] "\"Mar.2019\",\"127.150411989837\""                                                                            

out <- read.csv(text = txt)
str(out)
# 'data.frame': 25 obs. of  2 variables:
#  $ Mes.A.ntilde.o                                                                             : chr  "Mar.2019" "Abr.2019" "May.2019" "Jun.2019" ...
#  $ X.Iacute.ndice.de.precios.Lima.Metropolitana...iacute.ndice.2009...100....IPC.Sin.Alimentos: num  127 127 127 128 128 ...

(Not the best column names, but ... it's a start.)

(You may want to suggest to the admins there that they may not correctly understand the distinction between viewable CSV-like text in a browser window and actual CSV data.)

Upvotes: 2

Related Questions