we need a Mat. Stat.
we need a Mat. Stat.

Reputation: 375

coercing httr2 response to dataframe

I have a api call that yields a list of lists as the response. I would really like to see the response in the form of a dataframe, but cannot get that working by myself.

I wonder if anyone can help?

objectIds <- c(746, 3733, 2271, 2337, 3385, 3610, 2921, 1741, 1883, 885)
snes_url <- "https://services1.arcgis.com/gtRkYQyUQLMTM1uS/ArcGIS/rest/services/Species_of_National_Environmental_Significance_Distributions_(public_grids)/FeatureServer/0/query"

# here is the query, the actual number of objectIds I might require is arbitrary (though <2000)

full_query_params <- list(
  where = glue::glue("OBJECTID IN ({toString(objectIds[1:5])})"),
  outFields = "*",
  outSR = 4326,
  f = "json"
)

# set the request
dat_req <-request(snes_url)
# Add the query parameters using req_url_query
dat_req <- dat_req |> req_url_query(!!!full_query_params)
# Perform the request and get the response
my_response <- dat_req |> 
  req_perform() |>
  resp_body_json()


bb <- my_response |> pluck('features') 
bb[[1]]$attributes  # these are the data I want in a dataframe, one row per objectID (or in this example, one row per bb list item, bb[[1]], bb[[22]], bb[[3]]), one column per field.


# this is one of the methods I have tried for creating a dataframe, but not got to work.
extracted_data <- my_response |>
  pluck('features') |> 
  map_dfr( # iterates over each list and binds rows to a tibble
    \(x) {
      tibble(
        OBJECTID = x |> pluck('OBJECTID'),
        SCIENTIFIC_NAME = x |> pluck('SCIENTIFIC_NAME  '),
        VERNACULAR_NAME = x |> pluck('VERNACULAR_NAME'),
        TAXON_GROUP = x |> pluck('TAXON_GROUP')
      )
    }
  )
extracted_data

I was more or less following this guide: https://albert-rapp.de/posts/web_dev/07_httr2_ultimate_guide/07_httr2

Upvotes: 0

Views: 334

Answers (3)

Martin Morgan
Martin Morgan

Reputation: 46856

Following @grzegorz-sapijaszko's suggestion to manipulate the JSON string directly, but since you're only interested in a few fields, use CRAN rjsoncons to extract just those of interest using JMESpath syntax

query <- 'features[].attributes.{
    OBJECTID: OBJECTID,
    SCIENTIFIC_NAME: SCIENTIFIC_NAME,
    VERNACULAR_NAME: VERNACULAR_NAME,
    TAXON_GROUP: TAXON_GROUP
}'

req <-
    httr2::request(snes_url) |>
    httr2::req_url_query(!!!full_query_params)

json <-
    req |> 
    httr2::req_perform() |>
    httr2::resp_body_string()

json |>
    rjsoncons::j_pivot(query, as = "tibble")
## # A tibble: 5 × 4
##   OBJECTID SCIENTIFIC_NAME                 VERNACULAR_NAME            TAXON_GROUP
##      <int> <chr>                           <chr>                      <chr>      
## 1      746 Pterostylis cucullata           Leafy Greenhood            flora      
## 2     2271 Phaethon lepturus               White-tailed Tropicbird    birds      
## 3     2337 Lichenostomus melanops cassidix Helmeted Honeyeater, Yell… birds      
## 4     3385 Caladenia melanema              Ballerina Orchid, Balleri… flora      
## 5     3733 Anoxypristis cuspidata          Narrow Sawfish, Knifetoot… fishes     

Alternatively, since the query involves a simple URL, use it directly (without httr2)

url <- req$url
j_pivot(url, query, as = "tibble")

Upvotes: 1

Captain Hat
Captain Hat

Reputation: 3237

You can concisely wrangle my_response using map() and bind_rows() from purrr and dplyr respectively:

library(purrr)
library(dplyr)

map(my_response$features, "attributes") |> 
  bind_rows()

Note that when you pass a length-1 character vector to map(), it is plucked from each element of .x. That is, map(x, "foo") is equivalent to map(x, \(z) pluck(z, "foo"))

Upvotes: 0

Grzegorz Sapijaszko
Grzegorz Sapijaszko

Reputation: 3604

Don't parse geojson/json to list. Rather use it as an object. Below an example with {jsonlite}

objectIds <- c(746, 3733, 2271, 2337, 3385, 3610, 2921, 1741, 1883, 885)
snes_url <- "https://services1.arcgis.com/gtRkYQyUQLMTM1uS/ArcGIS/rest/services/Species_of_National_Environmental_Significance_Distributions_(public_grids)/FeatureServer/0/query"

# here is the query, the actual number of objectIds I might require is arbitrary (though <2000)

full_query_params <- list(
  where = glue::glue("OBJECTID IN ({toString(objectIds[1:5])})"),
  outFields = "*",
  outSR = 4326,
  f = "json"
)

# set the request
dat_req <-httr2::request(snes_url)
# Add the query parameters using req_url_query
dat_req <- dat_req |> httr2::req_url_query(!!!full_query_params)
# Perform the request and get the response
my_response <- dat_req |> 
  httr2::req_perform() |>
  httr2::resp_body_string() |>
  jsonlite::fromJSON()

my_response$features$attributes |>
  head()
#>   OBJECTID LISTED_TAXON_ID MAP_TAXON_ID                 SCIENTIFIC_NAME
#> 1      746           15459        15459           Pterostylis cucullata
#> 2     2271            1014         1014               Phaethon lepturus
#> 3     2337           26011        26011 Lichenostomus melanops cassidix
#> 4     3385           65295        65295              Caladenia melanema
#> 5     3733           68448        68448          Anoxypristis cuspidata
#>                                            VERNACULAR_NAME
#> 1                                          Leafy Greenhood
#> 2                                  White-tailed Tropicbird
#> 3 Helmeted Honeyeater, Yellow-tufted Honeyeater (Helmeted)
#> 4                Ballerina Orchid, Ballerina Spider Orchid
#> 5                       Narrow Sawfish, Knifetooth Sawfish
#>       THREATENED_STATUS MIGRATORY_STATUS MARINE CETACEAN PRESENCE_RANK
#> 1            Vulnerable             <NA>   <NA>       NA             1
#> 2                  <NA>        Migratory Listed       NA             2
#> 3 Critically Endangered             <NA>   <NA>       NA             1
#> 4 Critically Endangered             <NA>   <NA>       NA             1
#> 5                  <NA>        Migratory   <NA>       NA             2
#>                            PRESENCE_CATEGORY EXTRACT_DATE TAXON_GROUP
#> 1       Species or species habitat may occur 1.707955e+12       flora
#> 2 Species or species habitat likely to occur 1.707955e+12       birds
#> 3       Species or species habitat may occur 1.707955e+12       birds
#> 4       Species or species habitat may occur 1.707955e+12       flora
#> 5 Species or species habitat likely to occur 1.707955e+12      fishes
#>    TAXON_FAMILY    TAXON_ORDER    TAXON_CLASS  TAXON_PHYLUM TAXON_KINGDOM
#> 1   Orchidaceae     Orchidales     Liliopsida Magnoliophyta       Plantae
#> 2 Phaethontidae Pelecaniformes           Aves      Chordata      Animalia
#> 3  Meliphagidae  Passeriformes           Aves      Chordata      Animalia
#> 4   Orchidaceae     Orchidales     Liliopsida Magnoliophyta       Plantae
#> 5     Pristidae     Rajiformes Chondrichthyes      Chordata      Animalia
#>                 OTHER_IDS CELL_SIZE           REGIONS
#> 1 36179,83014,83015,85883       0.1 NSW; TAS; VIC; SA
#> 2                    <NA>      0.01 CI; ACI; CKI; CMA
#> 3                    <NA>      0.01               VIC
#> 4                    <NA>      0.01                WA
#> 5                    <NA>      0.01       NT; QLD; WA
#>                                                                                                                                                                                  ATTRIBUTION
#> 1 Species of National Environmental Significance 10km Grids February 2024 : Department of Climate Change, Energy, the Environment and Water, Canberra. Copyright: Commonwealth of Australia.
#> 2  Species of National Environmental Significance 1km Grids February 2024 : Department of Climate Change, Energy, the Environment and Water, Canberra. Copyright: Commonwealth of Australia.
#> 3  Species of National Environmental Significance 1km Grids February 2024 : Department of Climate Change, Energy, the Environment and Water, Canberra. Copyright: Commonwealth of Australia.
#> 4  Species of National Environmental Significance 1km Grids February 2024 : Department of Climate Change, Energy, the Environment and Water, Canberra. Copyright: Commonwealth of Australia.
#> 5  Species of National Environmental Significance 1km Grids February 2024 : Department of Climate Change, Energy, the Environment and Water, Canberra. Copyright: Commonwealth of Australia.
#>                                                                        SPRAT_PROFILE
#> 1 http://www.environment.gov.au/cgi-bin/sprat/public/publicspecies.pl?taxon_id=15459
#> 2  http://www.environment.gov.au/cgi-bin/sprat/public/publicspecies.pl?taxon_id=1014
#> 3 http://www.environment.gov.au/cgi-bin/sprat/public/publicspecies.pl?taxon_id=26011
#> 4 http://www.environment.gov.au/cgi-bin/sprat/public/publicspecies.pl?taxon_id=65295
#> 5 http://www.environment.gov.au/cgi-bin/sprat/public/publicspecies.pl?taxon_id=68448
#>    SHAPE__Area SHAPE__Length
#> 1 3.979351e+10    4774641.53
#> 2 1.648691e+12   15936995.18
#> 3 1.037232e+08      98139.79
#> 4 3.040163e+09    1104915.68
#> 5 5.942621e+11   51214296.25

Created on 2024-03-18 with reprex v2.1.0

Upvotes: 2

Related Questions