ithoughtso

Reputation: 163

How to map shapefile polygons to CSV data

I downloaded a polygon shapefile (england_ct_1991.shp) and a CSV file (england_ct_1991.csv), each from a zip archive of public data, and I want to join them. I added a new column called 'price' to the CSV so that each county has a single price, like this:

name,label,x,y,price
Avon,9,360567.1834,171823.554,11
Avon,9,322865.922,160665.4829,11
Bedfordshire,10,506219.5005,242767.306,20
Berkshire,11,464403.02,172809.5331,23....

I joined the shapefile and the CSV on the county name. The problem is that the map does not pick up the price, so the counties are not filled with a color gradient based on price. The YouTube tutorials I checked say the join is the important part, and it worked for them, so I am unsure what I did wrong.

library(ggplot2)
library(sf)
library(tidyverse)

# map of england counties
map7 <- read_sf("england_ct_1991.shp")
head(map7)

ggplot(map7) +
  geom_sf()

# get x (longitude) y (latitude) county names and prices
totalPrices <- read_csv("england_ct_1991.csv")
head(totalPrices)

# join map and csv data on county name
mappedData <- left_join(map7, totalPrices, by="name")
head(mappedData)

# print map
map1 <- ggplot(mappedData, aes( x=x, y=y, group=name)) +
   geom_polygon(aes(fill=price), color="black") +
   geom_sf()

map1

Upvotes: 0

Views: 329

Answers (1)

John Granger

Reputation: 301

The key point is the warning "Detected an unexpected many-to-many relationship between x and y" that left_join() raises when the map is joined to the CSV data by name.

So make sure totalPrices has one row per county, that is, no duplicated values in the name column.
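As a minimal sketch of that deduplication, assuming each county carries one price as in the sample rows from the question (the small data frame below is a stand-in for the real CSV, not the full file):

```r
library(dplyr)

# Stand-in for the CSV: several rows per county, one price per county
totalPrices <- data.frame(
  name  = c("Avon", "Avon", "Bedfordshire", "Berkshire"),
  label = c(9, 9, 10, 11),
  x     = c(360567.1834, 322865.922, 506219.5005, 464403.02),
  y     = c(171823.554, 160665.4829, 242767.306, 172809.5331),
  price = c(11, 11, 20, 23)
)

# Keep one row per county; x and y are dropped because the
# polygon geometry will come from the shapefile instead
prices <- totalPrices %>%
  distinct(name, price)

# Sanity check: no county name appears more than once
stopifnot(!any(duplicated(prices$name)))
```

If the check fails, some county has more than one distinct price and needs to be summarised (for example with a mean) before joining.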

library(ggplot2)
library(sf)
library(tidyverse)

map7 <- read_sf("england_ct_1991.shp")

totalPrices <- read_csv("england_ct_1991.csv")

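# one row per county; rnorm() is only a placeholder price here,
# because the downloaded CSV has no price column of its own
new <- totalPrices %>%
  group_by(name) %>%
  mutate(price = rnorm(1)) %>%
  distinct(name, price)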
new

## A tibble: 47 × 2
## Groups:   name [47]
#   name                           price
#   <chr>                          <dbl>
# 1 Lincolnshire                 -2.45  
# 2 Cumbria                      -0.413 
# 3 North Yorkshire               0.566 
# 4 Northumberland                0.179 
# 5 Cornwall and Isles of Scilly  1.05  
# 6 Devon                        -0.493 
# 7 Somerset                      0.324 
# 8 Dorset                        0.704 
# 9 East Sussex                   1.32  
#10 Wiltshire                     0.0161
## ℹ 37 more rows
## ℹ Use `print(n = ...)` to see more rows


# join the prices onto the sf object and let geom_sf() draw the polygons
map7 %>%
  left_join(new, by = "name") %>%
  ggplot() +
  geom_sf(aes(fill = price), color = "black")
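Note that geom_sf() takes the polygon outlines from the geometry column of the sf object, so no x, y, or group aesthetics and no geom_polygon() layer are needed. The sketch below shows the same join-then-plot pattern on a toy map; the two unit squares and the helper `square()` are made up for illustration and stand in for the real shapefile:

```r
library(sf)
library(dplyr)
library(ggplot2)

# Hypothetical helper: a unit square whose lower-left corner is at (x0, 0)
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Toy "map" with one polygon per county
toy_map <- st_sf(
  name     = c("Avon", "Bedfordshire"),
  geometry = st_sfc(square(0), square(1))
)

# One price per county, as required for a clean join
prices <- data.frame(name = c("Avon", "Bedfordshire"), price = c(11, 20))

# The join keeps the sf class, so the geometry column survives
joined <- left_join(toy_map, prices, by = "name")

# fill is the only aesthetic needed; geom_sf() finds the geometry itself
p <- ggplot(joined) +
  geom_sf(aes(fill = price), color = "black")
```

Printing `p` draws the two squares filled by price.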

Upvotes: 1
