perrycn
perrycn

Reputation: 1

Build Shapely point objects from .TIF

I would like to convert an image (.tiff) into Shapely points. There are 45 million pixels, I need a way to accomplish this without a loop (currently taking 15+ hours)

For example, I have a .tiff file which when opened is a 5000x9000 array. The values are pixel values (colors) that range from 1 to 215. I open tif with rasterio.open(xxxx.tif). Desired epsg is 32615

I need to preserve the pixel value but also attach geospatial positioning. This is to be able to sjoin over a polygon to see if the points are inside. I can handle the transform after processing, but I cannot figure a way to accomplish this without a loop. Any help would be greatly appreciated!

Upvotes: 1

Views: 527

Answers (1)

Michael Delgado
Michael Delgado

Reputation: 15452

If you just want a boolean array indicating whether the points are within any of the geometries, I'd dissolve the shapes into a single MultiPolygon then use shapely.vectorized.contains. The shapely.vectorized module is currently not covered in the documentation, but it's really good to know about!

Something along the lines of

# for a gridded dataset with 2-D arrays lats, lons
# and a list of shapely polygons/multipolygons all_shapes

XX = lons.ravel()
YY = lats.ravel()

single_multipolygon = shapely.ops.unary_union(all_shapes)

in_any_shape = shapely.vectorized.contains(single_multipolygon, XX, YY)

If you're looking to identify which shape the points are in, use geopandas.points_from_xy to convert your x, y point coordinates into a GeometryArray, then use geopandas.sjoin to find the index of the shape corresponding to each (x, y) point:

geoarray = geopandas.points_from_xy(XX, YY)
points_gdf = geopandas.GeoDataFrame(geometry=geoarray)

shapes_gdf = geopandas.GeoDataFrame(geometry=all_shapes)

shape_index_by_point = geopandas.sjoin(
    shapes_gdf, points_gdf, how='right', predicate='contains',
)

This is still a large operation, but it's vectorized and will be significantly faster than a looped solution. The geopandas route is also a good option if you'd like to convert the projection of your data or use other geopandas functionality.

Upvotes: 1

Related Questions