alextc

Reputation: 3515

Add a 'time' dimension to xarray Dataset and assign coordinates from another Dataset to it

I have a Dataset object (imported from a netCDF file through xarray.open_dataset) named ds. It contains a variable named variable1 and latitude and longitude dimensions.

>>>ds
<xarray.Dataset>
Dimensions:    (latitude: 681, longitude: 841)
Coordinates:
  * latitude   (latitude) float64 -10.0 -10.05 -10.1 ... -43.9 -43.95 -44.0
  * longitude  (longitude) float64 112.0 112.0 112.1 112.2 ... 153.9 153.9 154.0
Data variables:
    variable1     (latitude, longitude) float32 ...

I have a time DataArray object with daily coordinates from 2017-01-01 to 2017-12-31.

>>>times = pd.date_range("2017/01/01","2018/01/01",freq='D',inclusive='left')  # pandas >= 1.4; older versions use closed='left'
>>>time_da = xr.DataArray(times, [('time', times)])
>>>time_da
<xarray.DataArray (time: 365)>
array(['2017-01-01T00:00:00.000000000', '2017-01-02T00:00:00.000000000',
       '2017-01-03T00:00:00.000000000', ..., '2017-12-29T00:00:00.000000000',
       '2017-12-30T00:00:00.000000000', '2017-12-31T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2017-01-01 2017-01-02 ... 2017-12-31

I would like to add a new dimension called time and assign the coordinates from time_da to it so that the new Dataset ds2 will look like:

>>>ds2
<xarray.Dataset>
Dimensions:    (latitude: 681, longitude: 841, time: 365)
Coordinates:
  * longitude  (longitude) float64 112.0 112.0 112.1 112.2 ... 153.9 153.9 154.0
  * latitude   (latitude) float64 -10.0 -10.05 -10.1 ... -43.9 -43.95 -44.0
  * time       (time) datetime64[ns] 2017-01-01 2017-01-02 ... 2017-12-31
Data variables:
    variable1  (time, latitude, longitude) float32 nan nan nan ... nan nan nan

This means the original [latitude, longitude] array will be repeated 365 times, once for each day along the time dimension.

I tried to use ds.expand_dims to create a time dimension and then assign time_da to it, but this didn't work. The error was:

>>> ds2 = ds.expand_dims(dim='time', axis=0)
>>> ds2.coords['time'] = ('time',time_da)
ValueError: conflicting sizes for dimension 'time': length 1 on <this-array> and length 730 on 'time'

Upvotes: 5

Views: 11087

Answers (1)

paime

Reputation: 3552

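expand_dims can do this in one call: pass the new dimension name as a keyword argument with the coordinate values as its value. The dimension is then created with the full length of time_da instead of length 1: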

>>> dst = ds.expand_dims(time=time_da)
>>> dst
<xarray.Dataset>
Dimensions:    (latitude: 681, longitude: 841, time: 365)
Coordinates:
  * time       (time) datetime64[ns] 2017-01-01 2017-01-02 ... 2017-12-31
  * latitude   (latitude) int64 0 1 2 3 4 5 6 7 ... 674 675 676 677 678 679 680
  * longitude  (longitude) int64 0 1 2 3 4 5 6 7 ... 834 835 836 837 838 839 840
Data variables:
    variable   (time, latitude, longitude) float64 0.03968 2.156 ... -1.752

Checking that variable is the same at each timestep:

>>> np.all(np.diff(dst["variable"], axis=0) == 0)
True
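
For anyone who wants to try this without the original file, here is a minimal self-contained sketch. The grid size, the values in variable1, and the use of a plain pandas DatetimeIndex instead of time_da are my own assumptions for illustration; only the expand_dims call mirrors the answer above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small synthetic stand-in for the netCDF dataset (sizes and values are made up).
lat = np.linspace(-10.0, -10.2, 5)
lon = np.linspace(112.0, 112.15, 4)
data = np.arange(20, dtype="float32").reshape(5, 4)
ds = xr.Dataset(
    {"variable1": (("latitude", "longitude"), data)},
    coords={"latitude": lat, "longitude": lon},
)

# Daily timestamps for 2017. Both endpoints are given explicitly so the
# example does not depend on the closed=/inclusive= keyword of date_range.
times = pd.date_range("2017-01-01", "2017-12-31", freq="D")

# A single call adds the dimension and its coordinate values.
ds2 = ds.expand_dims(time=times)

# The new dimension is inserted first by default.
print(ds2["variable1"].dims)
print(ds2.sizes["time"])

# Every time step holds the same 2-D field as the original dataset.
assert np.all(np.diff(ds2["variable1"].values, axis=0) == 0)
```

The two-step attempt in the question fails because expand_dims(dim='time') creates a dimension of length 1, so a longer coordinate cannot be attached to it afterwards.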

Upvotes: 7
