I'm trying to follow the basemap tutorial for SST and ice analysis. My input data differs from the data in the example, though, which comes from netCDF4 as a masked array.
I have a pandas dataframe like:
val_df[0:5]
Out[47]:
lat lon value
0 0.4 98.7 NaN
1 0.4 98.8 NaN
2 0.4 98.9 0.64
3 0.4 99.0 NaN
4 0.5 98.5 1.23
Each lat/lon pair represents a unique location on a map grid for the data point. To create a sample dataframe, you can use the following code:
from itertools import product
import pandas as pd
import numpy as np
locations = np.array(list(product([1, 2, 3], [4, 5, 6])))
data = np.random.random(len(locations))
val_df = pd.DataFrame({'lat': locations[:, 0], 'lon': locations[:, 1], 'value': data})
What I have done before is take this dataframe and pivot it (using the built-in pivot function) so that the lat column is the index, the lon column supplies the columns, and the value column fills the cells. Then I can use m.imshow to plot the resulting values.
However, this seems to be a poor solution: A) pcolormesh is recommended over imshow, for reasons unclear to me, and B) it seems like people typically use meshgrid and then a masked array. I'm really unclear on how to structure my data into the meshgrid and masked array based on the basemap examples, since the data in the examples comes pre-shaped.
When I create the mesh grid of the lats/lons, I do the following:
latmin = np.floor(val_df.lat.min())
latmax = np.ceil(val_df.lat.max())
lonmin = np.floor(val_df.lon.min())
lonmax = np.ceil(val_df.lon.max())
lats = np.arange(latmin, latmax, 0.1)
lons = np.arange(lonmin, lonmax, 0.1)
lats_mesh, lons_mesh = np.meshgrid(lats, lons)
But I am unclear at this point how to structure and mask the value column such that the values appear in the correct location of the mesh grid when I give it to pcolormesh like so:
from mpl_toolkits.basemap import Basemap
m = Basemap(projection='merc'
, llcrnrlon=lonmin
, llcrnrlat=latmin
, urcrnrlon=lonmax
, urcrnrlat=latmax)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
m.fillcontinents(color='gray', lake_color='white', zorder=0)
m.drawmapboundary(fill_color='white')
pc1 = m.pcolormesh(lons, lats, masked_data, shading='flat', cmap='hot_r', latlon=True)
Upvotes: 4
Views: 5398
Basically, the answer to my question is that I needed to pivot my dataframe.
val_pivot_df = val_df.pivot(index='lat', columns='lon', values='value')
This pivots the dataframe into a grid with lat as the index and lon as the columns, and fills the cells where there is no data with NaNs. Since basemap doesn't work with pandas objects directly, I then extract the data as numpy arrays and plot it.
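As a minimal sketch of what the pivot produces, here is the sample frame from the question with one row dropped and one value set to NaN to mimic gaps. The names grid and masked and the choice of which entries to blank out are only illustrative assumptions, not part of my real data.

```python
from itertools import product

import numpy as np
import pandas as pd

# Sample frame as in the question: a 3 x 3 grid of lat/lon pairs.
locations = np.array(list(product([1, 2, 3], [4, 5, 6])))
rng = np.random.default_rng(0)
val_df = pd.DataFrame({
    "lat": locations[:, 0],
    "lon": locations[:, 1],
    "value": rng.random(len(locations)),
})

# Illustrative gaps: drop the (lat=2, lon=5) row entirely and
# set the value at (lat=1, lon=4) to NaN.
val_df = val_df.drop(index=4).reset_index(drop=True)
val_df.loc[0, "value"] = np.nan

# Rows become latitudes, columns become longitudes.
grid = val_df.pivot(index="lat", columns="lon", values="value")

# Locations with no row in val_df come back as NaN; mask every NaN cell.
masked = np.ma.masked_invalid(grid.values)

print(grid)
print(masked.mask)
```

Both kinds of gap (a missing row and an explicit NaN) end up masked, so pcolormesh leaves those cells blank.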
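import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.basemap import Basemap

# 1-D coordinate axes taken from the pivoted frame
lons = val_pivot_df.columns.values
lats = val_pivot_df.index.values
fig, ax = plt.subplots(1, figsize=(8, 8))
# Map extent: bounds of the rows that have data, padded by 5 degrees
bounds = val_df.dropna()
m = Basemap(projection='merc',
            llcrnrlat=bounds.lat.min() - 5,
            urcrnrlat=bounds.lat.max() + 5,
            llcrnrlon=bounds.lon.min() - 5,
            urcrnrlon=bounds.lon.max() + 5,
            resolution='i', area_thresh=10000)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
m.fillcontinents(color='gray', lake_color='white')
m.drawmapboundary(fill_color='0.3')
# 2-D coordinate grids with shape (len(lats), len(lons)), matching the pivot
x, y = np.meshgrid(lons, lats)
# Convert lon/lat degrees to map projection coordinates
px, py = m(x, y)
data_values = val_pivot_df.values
# Mask the NaN cells so they are not drawn
masked_data = np.ma.masked_invalid(data_values)
cmap = plt.cm.viridis
# Note: newer matplotlib releases may reject shading='flat' when the
# coordinates have the same shape as the data; shading='auto' is the
# usual substitute there.
m.pcolormesh(px, py, masked_data, cmap=cmap, vmin=0, vmax=2, shading='flat')
m.colorbar()
plt.show()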
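The shapes have to line up for pcolormesh: np.meshgrid(lons, lats) returns arrays with one row per latitude and one column per longitude, which is the same layout as the pivoted values. The sketch below checks this without drawing a map. It also reindexes onto a full regular grid, which goes beyond what I did above and should only matter if an entire latitude or longitude is absent from the data. The 0.1 degree spacing, the small frame, and the names full_lats and full_lons are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Small frame shaped like the question's data. Latitude 0.5 has no rows
# at all, so the pivot on its own would skip that latitude.
val_df = pd.DataFrame({
    "lat": [0.4, 0.4, 0.4, 0.6, 0.6],
    "lon": [98.7, 98.8, 98.9, 98.7, 98.9],
    "value": [np.nan, 0.64, 1.23, 0.5, 0.9],
})

# Round coordinates so float noise does not create near-duplicate labels.
val_df["lat"] = val_df["lat"].round(1)
val_df["lon"] = val_df["lon"].round(1)

grid = val_df.pivot(index="lat", columns="lon", values="value")

# Full regular axes (assumed 0.1 degree spacing); absent cells become NaN.
full_lats = np.round(np.linspace(0.4, 0.6, 3), 1)
full_lons = np.round(np.linspace(98.7, 98.9, 3), 1)
grid = grid.reindex(index=full_lats, columns=full_lons)

# meshgrid(lons, lats) gives shape (n_lats, n_lons), same as grid.values.
x, y = np.meshgrid(grid.columns.values, grid.index.values)
masked = np.ma.masked_invalid(grid.values)

print(x.shape, y.shape, masked.shape)
```

If the three shapes agree, x and y can be passed through the Basemap instance and on to pcolormesh together with masked, as in the plotting code above.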
Upvotes: 3