Reputation: 385
I am using geopandas sample data for this question.
import geopandas as gpd
df = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
My real dataset is somewhat different containing only 'polygon' type geometry points (in EPSG::4326), but what I would like to do is figure out the area of each polygon for each country in kilometers squared.
I am new to geopandas so I'm not sure if I am doing this right. My process is as follows;
ndf=df
ndf.to_crs("epsg:32633")
ndf["area"] = ndf['geometry'].area/ 10**6
ndf.head(2)
but the resulting areas don't make sense.
So I tried
df_2= df.to_crs({'proj':'cea'})
df_2["area"] = df_2['geometry'].area/ 10**6
df_2.head(2)
which is better, but still not accurate when run a google search for the areas.
So I'm wondering 1) is this the correct method? 2) how do I know the best projection type?
Upvotes: 1
Views: 979
Reputation: 18782
Computing polygon areas on equal-area
types of map projection does not always yield good result due to the requirement of dense vertices along the boundaries of the polygon involved.
Computing on the un-projected earth surface is not difficult. With appropriate Python library that takes great-circle arcs between succeeding vertices that forms the surface areas in the computation, the results are more accurate.
The most accurate (imho) method to compute surface areas on the earth with Python can be demonstrated with this simple code.
import geopandas as gpd
from pyproj import Geod, Proj
# Use the included dataset of Geopandas
df = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
# Prep an ellipsoidal earth (WGS84's parameters)
geod = Geod('+a=6378137 +f=0.0033528106647475126')
# List of countries
some_countries = ["Thailand", "Nepal"]
def area_perim (country_name):
pgon = df[df["name"]==country_name].geometry.iloc[0]
# Extract list of longitude/latitude of country's boundary
lons, lats = pgon.exterior.xy[:][0], pgon.exterior.xy[:][1]
# Compute surface area and perimeter
poly_area, poly_perimeter = geod.polygon_area_perimeter(lons, lats)
# Print the results
print("\nCountry:", country_name)
print("Area, (sq.Km): {:.1f}".format(abs(poly_area)/10**6))
print("Perimeter, (Km): {:.2f}".format(poly_perimeter/10**3))
for each in some_countries:
area_perim(each)
Output:
Country: Thailand Area, (sq.Km): 510125.6 Perimeter, (Km): 5555.56 Country: Nepal Area, (sq.Km): 150706.9 Perimeter, (Km): 1983.42
Note that, df
has CRS = epsg:4326.
If the source geodataframe you use has CRS other than epsg:4326
, you can convert it to epsg:4326 before use.
See reference for more details.
Upvotes: 1