user845888

restructuring pandas dataframe into meshgrid for basemap

I'm trying to follow the basemap tutorial for SST and ice analysis. My input data is different from the data in that example, though: the example data comes from netCDF4 as a masked array.

I have a pandas dataframe like:

val_df[0:5]
Out[47]:
    lat lon     value
0   0.4 98.7    NaN
1   0.4 98.8    NaN
2   0.4 98.9    0.64
3   0.4 99.0    NaN
4   0.5 98.5    1.23

The lats and lons represent a unique location on a map grid for the data point. To create a sample dataframe you can use the following code:

from itertools import product
import pandas as pd
import numpy as np

locations = np.array([x for x in product([1,2,3],[4,5,6])])
data = np.random.random(len(locations))
val_df = pd.DataFrame({'lat':locations[:,0], 'lon':locations[:,1],
                      'value':data})

What I have done before is take this dataframe and pivot it (using the built-in pivot function) so that lat becomes the index, lon becomes the columns, and value fills the cells. Then I can use m.imshow to plot the resulting values.

However, this seems to be a poor solution. A) pcolormesh seems to be recommended over imshow, for reasons unclear to me. B) It seems like people typically use meshgrid and then a masked array. I'm really unclear as to how to structure my data into the meshgrid and masked array based on the basemap examples, since the data in the examples comes pre-shaped.

When I create the mesh_grid of the lats/lons I do the following:

latmin = np.floor(val_df.lat.min())
latmax = np.ceil(val_df.lat.max())
lonmin = np.floor(val_df.lon.min())
lonmax = np.ceil(val_df.lon.max())
lats = np.arange(latmin, latmax, 0.1)
lons = np.arange(lonmin, lonmax, 0.1)
lats_mesh, lons_mesh = np.meshgrid(lats, lons)

But I am unclear at this point how to structure and mask the value column such that the values appear in the correct location of the mesh grid when I give it to pcolormesh like so:

from mpl_toolkits.basemap import Basemap
m = Basemap(projection='merc'
           , llcrnrlon=lonmin
           , llcrnrlat=latmin
           , urcrnrlon=lonmax
           , urcrnrlat=latmax)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
m.fillcontinents(color='gray', lake_color='white', zorder=0)
m.drawmapboundary(fill_color='white')

pc1 = m.pcolormesh(lons, lats, masked_data, shading='flat', cmap='hot_r', latlon=True)

Upvotes: 4

Views: 5398

Answers (1)

user845888

Basically, the answer to my question is I needed to pivot my dataframe.

val_pivot_df = val_df.pivot(index='lat', columns='lon', values='value')

This pivots the dataframe and fills the locations where there is no data with NaNs. Since basemap doesn't work directly with pandas objects, I then pull the data out as numpy arrays and plot those.
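To illustrate the NaN filling, here is a minimal sketch on a made-up dataframe. The names `demo_df`, `demo_pivot` and `demo_masked` and the values in them are hypothetical and only for illustration; one lat/lon combination is deliberately left out.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the (lat=2, lon=5) location is deliberately absent.
demo_df = pd.DataFrame({
    'lat':   [1, 1, 2],
    'lon':   [4, 5, 4],
    'value': [0.1, 0.2, 0.3],
})

# Rows become lats, columns become lons; missing combinations become NaN.
demo_pivot = demo_df.pivot(index='lat', columns='lon', values='value')

# Mask the NaN cells so that plotting routines can skip them.
demo_masked = np.ma.masked_invalid(demo_pivot.values)

print(demo_pivot)
print(demo_masked.mask)
```

The missing location ends up as the only masked cell, which is what should let pcolormesh leave it blank.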

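import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

# after the pivot, columns hold the lons and the index holds the lats
lons = val_pivot_df.columns.values
lats = val_pivot_df.index.values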

fig, ax = plt.subplots(1, figsize=(8,8))

m = Basemap(projection='merc',
            llcrnrlat=val_df.dropna().min().lat - 5,
            urcrnrlat=val_df.dropna().max().lat + 5,
            llcrnrlon=val_df.dropna().min().lon - 5,
            urcrnrlon=val_df.dropna().max().lon + 5,
            resolution='i', area_thresh=10000)

m.drawcoastlines()
m.drawstates()
m.drawcountries()
m.fillcontinents(color='gray', lake_color='white')#, zorder=0)
m.drawmapboundary(fill_color='0.3')

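# lons first, so x and y have shape (len(lats), len(lons)), same as the pivot
x, y = np.meshgrid(lons, lats)
# convert lon/lat in degrees to map projection coordinates
px, py = m(x, y)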

data_values = val_pivot_df.values
masked_data = np.ma.masked_invalid(data_values)

cmap = plt.cm.viridis

m.pcolormesh(px, py, masked_data, cmap=cmap, vmin=0, vmax=2, shading='flat')

m.colorbar()
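As far as I can tell, the reason np.meshgrid(lons, lats) (lons first) lines up with the pivoted values is that both end up with shape (len(lats), len(lons)). A quick check without basemap, using made-up coordinates (the `*_demo` names and values are hypothetical stand-ins for the pivot's index, columns and values):

```python
import numpy as np

# Hypothetical stand-ins for val_pivot_df.index.values / .columns.values
lats_demo = np.array([1.0, 2.0, 3.0])
lons_demo = np.array([4.0, 5.0, 6.0, 7.0])

# Passing lons first makes x vary along columns and y vary along rows.
x_demo, y_demo = np.meshgrid(lons_demo, lats_demo)

# Stand-in for val_pivot_df.values: one row per lat, one column per lon.
values_demo = np.zeros((len(lats_demo), len(lons_demo)))

print(x_demo.shape, y_demo.shape, values_demo.shape)
```

Each row of x_demo repeats the lons and each column of y_demo repeats the lats, so cell [i, j] of the values corresponds to (lats_demo[i], lons_demo[j]).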

Upvotes: 3
