tchoup
tchoup

Reputation: 1023

st_contains matching: CRS Coordinate System Problem / mismatch

I have two dataframes, df_points and df_polygon; I want to match based on points being within a given polygon. However, I am running into a few errors due to mismatch of CRS. This is what I have done:

First, I needed both of these to be sf class; I checked the classes / CRS and converted as follows (running with output):

>class(df_polygon)
[1] "sf"         "data.frame"
>class(df_points)
[1] "data.frame"

>df_points <- st_as_sf(df_points, coords = c("X", "Y"))
>class(df_points)
[1] "sf"         "data.frame"

> st_crs(df_points)
Coordinate Reference System: NA
 st_crs(df_polygon)
Coordinate Reference System:
  User input: NAD83 / North Carolina (ftUS) 
  wkt:
PROJCRS["NAD83 / North Carolina (ftUS)",
    BASEGEOGCRS["NAD83",
        DATUM["North American Datum 1983",
            ELLIPSOID["GRS 1980",6378137,298.257222101,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4269]],
    CONVERSION["SPCS83 North Carolina zone (US Survey feet)",
        METHOD["Lambert Conic Conformal (2SP)",
            ID["EPSG",9802]],
        PARAMETER["Latitude of false origin",33.75,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8821]],
        PARAMETER["Longitude of false origin",-79,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8822]],
        PARAMETER["Latitude of 1st standard parallel",36.1666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8823]],
        PARAMETER["Latitude of 2nd standard parallel",34.3333333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8824]],
        PARAMETER["Easting at false origin",2000000,
            LENGTHUNIT["US survey foot",0.304800609601219],
            ID["EPSG",8826]],
        PARAMETER["Northing at false origin",0,
            LENGTHUNIT["US survey foot",0.304800609601219],
            ID["EPSG",8827]]],
    CS[Cartesian,2],
        AXIS["easting (X)",east,
            ORDER[1],
            LENGTHUNIT["US survey foot",0.304800609601219]],
        AXIS["northing (Y)",north,
            ORDER[2],
            LENGTHUNIT["US survey foot",0.304800609601219]],
    USAGE[
        SCOPE["unknown"],
        AREA["USA - North Carolina"],
        BBOX[33.83,-84.33,36.59,-75.38]],
    ID["EPSG",2264]]

>df_points<- st_transform(df_points, crs = st_crs(df_polygon))
Error in st_transform.sfc(st_geometry(x), crs, ...) : 
  cannot transform sfc object with missing crs

>combine <- st_contains(df_polygon, df_points)
Error in st_geos_binop("contains", x, y, sparse = sparse, prepared = prepared,  : 
  st_crs(x) == st_crs(y) is not TRUE

My understanding at this point is that I have both these files loaded; one was not a spatial file so I made it one (st_as_sf), then I checked the CRS and made them match (st_transform). However, for whatever reason the match failed because the CRS is missing. And therefore, when I run st_contains, the crs don't match and that fails too.

Any advice on what I am doing wrong? I have seen in other answers that I could do a st_transform where I set the crs as a number, but I'm not sure what numbers are valid to use.

Thanks!

[EDIT] adding (some) dput per the comment; both df_points and df_polygons have close to 5.5mil observations, so just posted the end of it; not sure if that is enough to be useful:

dput(df_polygons)


5509796L, 5509797L, 5509798L, 5509799L, 5509800L, 5509801L, 5509802L, 
5509803L, 5509804L, 5509805L, 5509806L, 5509807L, 5509808L), class = c("sf", 
"data.frame"), sf_column = "Shape", agr = structure(c(PARNO = NA_integer_, 
ALTPARNO = NA_integer_, OWNNAME = NA_integer_, IMPROVVAL = NA_integer_, 
LANDVAL = NA_integer_, PARVAL = NA_integer_, MAILADD = NA_integer_, 
MUNIT = NA_integer_, MCITY = NA_integer_, MSTATE = NA_integer_, 
MZIP = NA_integer_, SITEADD = NA_integer_, SUNIT = NA_integer_, 
SCITY = NA_integer_, STNAME = NA_integer_, SZIP = NA_integer_
), .Label = c("constant", "aggregate", "identity"), class = "factor"))

dput(df_points)

5), class = c("XY", 
        "POINT", "sfg")), structure(c(-84.033419, 35.085265), class = c("XY", 
        "POINT", "sfg")), structure(c(-84.033419, 35.085265), class = c("XY", 
        "POINT", "sfg"))), n_empty = 0L, precision = 0, crs = structure(list(
        input = NA_character_, wkt = NA_character_), class = "crs"), bbox = structure(c(xmin = -106.399281, 
    ymin = 26.436531, xmax = -70.927672, ymax = 44.229347), class = "bbox"), class = c("sfc_POINT", 
    "sfc"))), row.names = c(NA, 5447703L), class = c("sf", "data.frame"
), sf_column = "geometry", agr = structure(c(PERSON_ID = NA_integer_, 
MOVE_ID = NA_integer_, address = NA_integer_, city = NA_integer_, 
st = NA_integer_, zip = NA_integer_, oaddress = NA_integer_, 
ocity = NA_integer_, ostate = NA_integer_, ozipcode = NA_integer_, 
MOVE_COUNT_TU = NA_integer_, PERSON_ADDRESS_ID = NA_integer_, 
FULL_ADD = NA_integer_, OFULL_ADD = NA_integer_, ADDRESS_ID_RAW = NA_integer_, 
FULL_ADDR_C = NA_integer_, ADDRESS_ID_CLEAN = NA_integer_, Loc_name = NA_integer_, 
Status = NA_integer_, Score = NA_integer_, Match_type = NA_integer_, 
Match_addr = NA_integer_, Addr_type = NA_integer_), class = "factor", .Label = c("constant", 
"aggregate", "identity")))

Upvotes: 0

Views: 947

Answers (1)

ktiu
ktiu

Reputation: 2636

You can only transform a sf object to a new CRS if its current CRS is known. Your df_points has no CRS associated with it, so the transform fails.

If, for example, df_points was in WSG84 (a reasonable guess, but you should make sure with its source) you could then do

df_points_with_crs <- st_set_crs(df_points, 4326)

before transforming the new (referenced) dataset and combining it with the polygons.

Upvotes: 1

Related Questions