Reputation: 21
I have a dataset with the coordinates stations, lat, lon and time. Right now the dataset uses (stations, time) as dimensions, but I would like it to use (lat, lon, time).
I looked online and found how to swap dimensions, but every example I could find only swaps a single dimension.
Any suggestions on how to do this?
<xarray.Dataset>
Dimensions:     (stations: 11, time: 7320)
Coordinates:
  * stations    (stations) int64 11425 11426 11427 11428 ... 11433 11434 11435
    lat         (stations) float64 39.54 39.36 39.24 39.07 ... 38.07 37.9 37.81
    lon         (stations) float64 -74.25 -74.4 -74.6 ... -75.19 -75.34 -75.51
  * time        (time) datetime64[ns] 2010-02-01 ... 2010-02-06T01:59:00
Data variables:
    waterlevel  (time, stations) float64 0.0002405 0.0002313 ... -0.01266
Upvotes: 2
Views: 864
Reputation: 900
You can make the station coordinate a MultiIndex of the lat and lon coordinates using set_index (as explained here). In a second step, you can then unstack the MultiIndex to make lat and lon the dataset dimensions. Note, however, that this will blow up the size of your dataset (unless the stations are already on a regular grid), because grid points without a station are filled with NaN values. For many applications, making the station dimension a MultiIndex of lat and lon should be enough.
import numpy as np
import pandas as pd
import xarray as xr

# Example dataset: 5 stations at random positions, 20 daily time steps
ds = xr.Dataset(
    data_vars={"waterlevels": (("station", "time"), np.random.rand(5, 20))},
    coords={
        "station": ("station", ["a", "b", "c", "d", "e"]),
        "lon": ("station", np.random.rand(5)),
        "lat": ("station", np.random.rand(5)),
        "time": pd.date_range(start="10-05-2021", periods=20, freq="D"),
    },
)

# Rename the station coordinate so that you don't overwrite it
ds = ds.rename_vars({"station": "station_id"})

# Create MultiIndex coordinate
ds_multiindex = ds.set_index(station=["lat", "lon"])

# Unstack the MultiIndex so that lat and lon become dimensions
ds_unstacked = ds_multiindex.unstack()
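To make the size warning concrete, here is a minimal sketch on a toy dataset that uses the dimension names from the question. The values are made up, and storing the station ids under a separate station_id coordinate is an assumption made for this illustration. It first selects a single station through the MultiIndex, then unstacks and counts how many grid cells actually hold data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy stand-in for the dataset in the question: 3 stations, 4 time steps.
# "station_id" is a hypothetical name used here so the ids are kept
# alongside the (lat, lon) index.
ds = xr.Dataset(
    data_vars={"waterlevel": (("time", "stations"), np.arange(12.0).reshape(4, 3))},
    coords={
        "station_id": ("stations", [11425, 11426, 11427]),
        "lat": ("stations", [39.54, 39.36, 39.24]),
        "lon": ("stations", [-74.25, -74.40, -74.60]),
        "time": pd.date_range("2010-02-01", periods=4, freq="D"),
    },
)

# Index the "stations" dimension by (lat, lon) pairs.
ds_mi = ds.set_index(stations=["lat", "lon"])

# With the MultiIndex alone, a station can be selected by its position.
one_station = ds_mi.sel(stations=(39.36, -74.40))

# Unstacking turns lat and lon into dimensions. The 3 stations do not lie
# on a regular grid, so the result is a 3 x 3 grid per time step in which
# only the cells that contain a station hold data; the rest are NaN.
gridded = ds_mi.unstack("stations")

n_valid = int(gridded["waterlevel"].notnull().sum())
n_total = int(gridded["waterlevel"].size)
print(f"{n_valid} of {n_total} grid values are not NaN")
```

With N stations at distinct latitudes and longitudes, the unstacked grid has N x N cells per time step, so the fraction of filled cells shrinks as more stations are added. That is why keeping the MultiIndex is often the more practical choice.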
Upvotes: 1