jason

Reputation: 43

Using API in R and converting to dataframe format

I'm trying to call an API to retrieve rainfall readings from a government website.

library(data.table)
library(jsonlite)
library(httr)
base<-"https://api.data.gov.sg/v1/environment/rainfall"
date1<-"2020-01-25"
call1<-paste(base,"?","date","=",date1,sep="")

get_rainfall<-GET(call1)
get_rainfall_text<-content(get_rainfall,"text")
get_rainfall_json <- fromJSON(get_rainfall_text, flatten = TRUE)
get_rainfall_df <- as.data.frame(get_rainfall_json)

The last line fails with the error "Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 52, 287, 1".

I'm not sure how to resolve this. I'm trying to convert the retrieved data into a data frame so I can make sense of the readings.

Upvotes: 0

Views: 687

Answers (1)

SharpSharpLes

Reputation: 324

Your "get_rainfall_json" object comes back as a list, and its elements have different lengths (the 52, 287 and 1 in the error message), so as.data.frame() cannot combine them into a single data frame. Converting only the "items" element of the list resolves the error. The result still contains nested data (each row holds its own set of readings), so you will need to parse it further into the format you are interested in.

get_rainfall_df <- as.data.frame(get_rainfall_json$items)
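To see why converting the whole list fails while converting one element works, here is a small offline sketch. The list "resp" is made-up data that only mimics the situation described by the error message (elements with different row counts); its names and values are assumptions, not the real API response.

```r
# Hypothetical stand-in for the parsed JSON: a list whose elements
# have different numbers of rows (names and values are invented).
resp <- list(
  stations = data.frame(id = c("S1", "S2", "S3"), stringsAsFactors = FALSE),
  items    = data.frame(timestamp = c("t1", "t2"), stringsAsFactors = FALSE),
  status   = "healthy"
)

# Converting the whole list fails because the row counts differ (3, 2, 1).
# tryCatch() captures the error message instead of stopping the script.
whole <- tryCatch(as.data.frame(resp), error = function(e) conditionMessage(e))
print(whole)

# Converting a single element works.
items_df <- as.data.frame(resp$items)
print(nrow(items_df))
```

The same reasoning applies to the real response: pick the element that holds the rows you want before calling as.data.frame().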

Update

To flatten the nested data, here is one way you could do it: loop through each row, extract the readings list in that row, turn it into a data frame, and append it to "df". You are then left with one final df with all the data in one place.

library(data.table)
library(jsonlite)
library(httr)
library(dplyr)

base <- "https://api.data.gov.sg/v1/environment/rainfall"
date1 <- "2020-01-25"
call1 <- paste(base, "?", "date", "=", date1, sep = "")

get_rainfall <- GET(call1)
get_rainfall_text <- content(get_rainfall,"text")
get_rainfall_json <- fromJSON(get_rainfall_text, flatten = TRUE)
get_rainfall_df <- as.data.table(get_rainfall_json$items)

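# Start with an empty data frame and append each row's readings to it.
df <- data.frame()

# seq_len() avoids a bogus 1:0 loop if no rows come back.
for (row in seq_len(nrow(get_rainfall_df))) {
  # Each row holds a nested data frame of station readings.
  new_date <- get_rainfall_df[row, ]$readings[[1]]
  colnames(new_date) <- c("stationid", "value")
  # Tag the readings with the timestamp of the row they came from.
  date <- get_rainfall_df[row, ]$timestamp
  new_date$date <- date
  df <- bind_rows(df, new_date)
}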
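If appending inside a loop gets slow, the same flattening can be sketched without growing "df" row by row. The example below uses base R only and runs on made-up data shaped like the "items" table (one row per timestamp, with a list column of readings). The station IDs, values and the "station_id" column name are assumptions for illustration.

```r
# Made-up data shaped like the `items` table: one row per timestamp,
# with a list column holding a data frame of readings per row.
items <- data.frame(
  timestamp = c("2020-01-25T00:05:00", "2020-01-25T00:10:00"),
  stringsAsFactors = FALSE
)
items$readings <- list(
  data.frame(station_id = c("S1", "S2"), value = c(0, 0.2),
             stringsAsFactors = FALSE),
  data.frame(station_id = c("S1", "S2"), value = c(0.4, 0),
             stringsAsFactors = FALSE)
)

# Tag each readings table with the timestamp of its row,
# then stack all of them in a single rbind() call.
per_row <- Map(function(r, ts) cbind(r, date = ts),
               items$readings, items$timestamp)
flat <- do.call(rbind, per_row)
rownames(flat) <- NULL
print(flat)
```

With the real data, "items" would be get_rainfall_json$items. Stacking once at the end avoids copying the accumulated data frame on every iteration.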

Upvotes: 1
