Reputation: 85
I have an Xarray dataset with just two pieces of information, a time referenced by 'time' and a time referenced by 'reftime':
<xarray.Dataset>
Dimensions: ()
Coordinates:
reftime datetime64[ns] 2020-03-31T06:00:00
time datetime64[ns] 2020-03-31T12:00:00
crs object Projection: latitude_longitude
Data variables:
*empty*
Attributes:
Originating_or_generating_Center: ...
Originating_or_generating_Subcenter: ...
GRIB_table_version: ...
Type_of_generating_process: ...
Analysis_or_forecast_generating_process_identifier_defined_by_originating...
Conventions: ...
history: ...
featureType: ...
History: ...
geospatial_lat_min: ...
geospatial_lat_max: ...
geospatial_lon_min: ...
geospatial_lon_max: ...
Everything else is empty. My goal is to get whatever date is referenced by 'reftime' into a string format. Normally, I understand that this can be done by calling dataset['reftime'], but the catch is that this code is intended to run in the background, and sometimes it will find xarray datasets where the time I want is referenced by 'reftimeX', where X is some number. How can I extract whatever information is stored in the first coordinate (be it reftime, reftime1, or reftimeX) so that it can be stored as a string?
I've tried turning it into a DataArray in the hopes that I could then turn it into a numpy array and extract the string from there, but when I try to turn it into a DataArray:
filtered_dataarray = filtered_ds.to_array()
I get an error:
ValueError: at least one array or dtype is required
The Xarray docs suggest that this function needs some self parameter:
Dataset.to_array(self, dim='variable', name=None)
But thus far I have been unable to figure out to what this is referring.
Upvotes: 1
Views: 3719
Reputation: 890
You can get a list of all coordinates in the dataset like this:
coord_names = list(ds.coords)
If you are sure that the coordinate you want is always the first one, you could access it via
ds[coord_names[0]]
However, I would rather go through the coords and check the exact name of the coordinate you want. Given that you know it must contain "reftime", you could do:
reftime_name = [var for var in ds.coords if "reftime" in var][0]
ds[reftime_name]
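To get from the coordinate to an actual string, something like the following might work. This is only a sketch: the dataset is a stand-in built by hand, find_reftime_name is a hypothetical helper (not part of xarray), and the regular expression assumes the coordinate is called reftime optionally followed by digits.

```python
import re

import numpy as np
import xarray as xr

# Stand-in for the real dataset: two scalar datetime coordinates, no data variables.
ds = xr.Dataset(
    coords={
        "reftime1": np.datetime64("2020-03-31T06:00:00", "ns"),
        "time": np.datetime64("2020-03-31T12:00:00", "ns"),
    }
)


def find_reftime_name(dataset):
    """Hypothetical helper: return the first coord named 'reftime' or 'reftime<digits>'."""
    for name in dataset.coords:
        if re.fullmatch(r"reftime\d*", str(name)):
            return name
    raise KeyError("no reftime-like coordinate found")


reftime_name = find_reftime_name(ds)

# .values gives the underlying numpy datetime64; format it at one-second resolution.
reftime_str = str(np.datetime_as_string(ds[reftime_name].values, unit="s"))

print(reftime_name)  # reftime1
print(reftime_str)   # 2020-03-31T06:00:00
```

Raising a KeyError when nothing matches is a design choice for background jobs, so that a missing coordinate fails loudly instead of silently picking the wrong one.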
The to_array method does not do what you expect it to. It would take all data variables of the dataset and concatenate them along a new dimension. However, your dataset does not contain any data variables (only coords), so it throws an error.
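To illustrate what to_array does when data variables are present, here is a small sketch. The dataset and the variable names a and b are made up for the example; they are not from the question.

```python
import numpy as np
import xarray as xr

# Made-up dataset with two data variables sharing the dimension "x".
ds_with_vars = xr.Dataset(
    {
        "a": ("x", np.array([1.0, 2.0, 3.0])),
        "b": ("x", np.array([4.0, 5.0, 6.0])),
    }
)

# to_array stacks the data variables along a new dimension (named "variable" by default).
stacked = ds_with_vars.to_array()

print(stacked.dims)   # ('variable', 'x')
print(stacked.shape)  # (2, 3)
```

With no data variables at all, there is nothing to stack, which is why the call fails on a coords-only dataset.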
The self argument indicates that to_array is an object method. self is a reference to the current instance of the class. Usually, you would call the method on an object (e.g. ds.to_array()) and then you do not need to explicitly pass the self parameter (see also here).
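A plain-Python sketch of the same idea, using a made-up Greeter class (nothing to do with xarray) to show that calling a method on an instance fills in self automatically:

```python
class Greeter:
    """Made-up class, used only to show how self is supplied."""

    def __init__(self, name):
        self.name = name

    def greet(self, punctuation="!"):
        # self is the instance the method was called on.
        return f"Hello, {self.name}{punctuation}"


g = Greeter("xarray")

# Usual form: Python passes g as self behind the scenes.
bound_result = g.greet()

# Equivalent explicit form: call through the class and pass the instance yourself.
explicit_result = Greeter.greet(g)

print(bound_result)     # Hello, xarray!
print(explicit_result)  # Hello, xarray!
```

So in a signature like Dataset.to_array(self, dim='variable', name=None), the only arguments you would normally supply yourself are dim and name.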
Upvotes: 2