Reputation: 41
Hi, I'm looking to bring a GeoJSON FeatureCollection from the UK's Office for National Statistics API into a data.frame using the httr package.
library(httr)
HealthGeog <- GET("https://opendata.arcgis.com/datasets/f0095af162f749ad8231e6226e1b7e30_0.geojson")
I get a successful response:
> HealthGeog
Response [https://opendata.arcgis.com/datasets/f0095af162f749ad8231e6226e1b7e30_0.geojson]
Date: 2018-11-21 13:28
Status: 200
Content-Type: application/json; charset=utf-8
Size: 9.59 MB
But being new to working with JSON, I'm not sure how to navigate to the list of features within the FeatureCollection and load it into a data.frame.
Upvotes: 1
Views: 393
Reputation: 78792
We can use R spatial tools to read it, but see the section after this one for why you might not need to:
library(sf)
library(tidyverse)
health_geog_url <- "https://opendata.arcgis.com/datasets/f0095af162f749ad8231e6226e1b7e30_0.geojson"
# don't be one of 'those people' and waste bandwidth that isn't yours:
httr::GET(
  url = health_geog_url,
  httr::write_disk(basename(health_geog_url)),
  httr::progress()
)
health_geog <- st_read(basename(health_geog_url))
## Reading layer `OGRGeoJSON' from data source `/Users/bob/Desktop/f0095af162f749ad8231e6226e1b7e30_0.geojson' using driver `GeoJSON'
## replacing null geometries with empty geometries
## Simple feature collection with 32844 features and 10 fields (with 32844 geometries empty)
## geometry type: GEOMETRYCOLLECTION
## dimension: XY
## bbox: xmin: NA ymin: NA xmax: NA ymax: NA
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
health_geog
## Simple feature collection with 32844 features and 10 fields (with 32844 geometries empty)
## geometry type: GEOMETRYCOLLECTION
## dimension: XY
## bbox: xmin: NA ymin: NA xmax: NA ymax: NA
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## First 10 features:
## LSOA11CD LSOA11NM CCG18CD CCG18CDH CCG18NM STP18CD
## 1 E01011388 Leeds 019B E38000225 15F NHS Leeds CCG E54000005
## 2 E01011865 Wakefield 042D E38000190 03R NHS Wakefield CCG E54000005
## 3 E01011833 Wakefield 025E E38000190 03R NHS Wakefield CCG E54000005
## 4 E01011390 Leeds 087A E38000225 15F NHS Leeds CCG E54000005
## 5 E01011866 Wakefield 045B E38000190 03R NHS Wakefield CCG E54000005
## 6 E01011834 Wakefield 015A E38000190 03R NHS Wakefield CCG E54000005
## 7 E01011391 Leeds 087B E38000225 15F NHS Leeds CCG E54000005
## 8 E01011867 Wakefield 042E E38000190 03R NHS Wakefield CCG E54000005
## 9 E01011835 Wakefield 012A E38000190 03R NHS Wakefield CCG E54000005
## 10 E01011392 Leeds 087C E38000225 15F NHS Leeds CCG E54000005
## STP18NM LAD18CD LAD18NM FID geometry
## 1 West Yorkshire E08000035 Leeds 1001 GEOMETRYCOLLECTION EMPTY
## 2 West Yorkshire E08000036 Wakefield 1002 GEOMETRYCOLLECTION EMPTY
## 3 West Yorkshire E08000036 Wakefield 1003 GEOMETRYCOLLECTION EMPTY
## 4 West Yorkshire E08000035 Leeds 1004 GEOMETRYCOLLECTION EMPTY
## 5 West Yorkshire E08000036 Wakefield 1005 GEOMETRYCOLLECTION EMPTY
## 6 West Yorkshire E08000036 Wakefield 1006 GEOMETRYCOLLECTION EMPTY
## 7 West Yorkshire E08000035 Leeds 1007 GEOMETRYCOLLECTION EMPTY
## 8 West Yorkshire E08000036 Wakefield 1008 GEOMETRYCOLLECTION EMPTY
## 9 West Yorkshire E08000036 Wakefield 1009 GEOMETRYCOLLECTION EMPTY
## 10 West Yorkshire E08000035 Leeds 1010 GEOMETRYCOLLECTION EMPTY
This seems to be a GeoJSON file with no geometries, which likely means it's really just "data". Many of those opendata.arcgis.com endpoints also have a CSV option, and this one does:
health_geog_url_csv <- "https://opendata.arcgis.com/datasets/f0095af162f749ad8231e6226e1b7e30_0.csv"
httr::GET(
  url = health_geog_url_csv,
  httr::write_disk(basename(health_geog_url_csv)),
  httr::progress()
)
read_csv(basename(health_geog_url_csv))
## # A tibble: 32,844 x 10
## LSOA11CD LSOA11NM CCG18CD CCG18CDH CCG18NM STP18CD STP18NM LAD18CD
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 E01011388 Leeds 01… E380002… 15F NHS Lee… E54000… West Yo… E08000…
## 2 E01011865 Wakefiel… E380001… 03R NHS Wak… E54000… West Yo… E08000…
## 3 E01011833 Wakefiel… E380001… 03R NHS Wak… E54000… West Yo… E08000…
## 4 E01011390 Leeds 08… E380002… 15F NHS Lee… E54000… West Yo… E08000…
## 5 E01011866 Wakefiel… E380001… 03R NHS Wak… E54000… West Yo… E08000…
## 6 E01011834 Wakefiel… E380001… 03R NHS Wak… E54000… West Yo… E08000…
## 7 E01011391 Leeds 08… E380002… 15F NHS Lee… E54000… West Yo… E08000…
## 8 E01011867 Wakefiel… E380001… 03R NHS Wak… E54000… West Yo… E08000…
## 9 E01011835 Wakefiel… E380001… 03R NHS Wak… E54000… West Yo… E08000…
## 10 E01011392 Leeds 08… E380002… 15F NHS Lee… E54000… West Yo… E08000…
## # ... with 32,834 more rows, and 2 more variables: LAD18NM <chr>,
## # FID <int>
I'd use the CSV option.
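If you do want to stay with the JSON you already fetched, here is a minimal sketch of pulling the attribute table out of a FeatureCollection with jsonlite. It assumes every feature carries the same flat `properties` object, which appears to be the case for this file but is not guaranteed for GeoJSON in general. The inline `geojson_txt` string is a hypothetical two-feature stand-in (values copied from the output above) so the example runs without a download:

```r
library(jsonlite)

# Hypothetical stand-in for the downloaded file. With the real httr response
# you would get the text with something like:
#   geojson_txt <- httr::content(HealthGeog, as = "text", encoding = "UTF-8")
geojson_txt <- '{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "geometry": null,
     "properties": {"LSOA11CD": "E01011388", "LAD18NM": "Leeds", "FID": 1001}},
    {"type": "Feature", "geometry": null,
     "properties": {"LSOA11CD": "E01011865", "LAD18NM": "Wakefield", "FID": 1002}}
  ]
}'

# fromJSON() simplifies the "features" array into a data.frame; its
# "properties" column is itself a nested data.frame, one row per feature.
parsed <- fromJSON(geojson_txt)
props  <- parsed$features$properties

str(props)
```

Here `props` is an ordinary data.frame with one column per property. On the full file this should give the same fields as the CSV, just with more parsing work.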
Upvotes: 1