Jaken
Jaken

Reputation: 521

Extracting time from an unconventionally-named netcdf using terra

I have a netcdf file (ERA5 climate data) that I'm converting to a SpatRaster using terra. I would like to access the time data for each layer; however, the date element of the original file is named valid_time rather than something more traditional. Possibly because of this, it does not seem to be recognized by the terra::time argument.

Here is the code used to process the netcdf file. Apologies for the lack of reproducible example, this problem is specific to the raster and I'm not sure how to recreate it:

library(terra)
ncfile <- 'filepath'
sample_rast <- terra::rast(ncfile)

sample_rast
class       : SpatRaster 
dimensions  : 1801, 3600, 36  (nrow, ncol, nlyr)
resolution  : 0.1, 0.1  (x, y)
extent      : -0.05, 359.95, -90.05, 90.05  (xmin, xmax, ymin, ymax)
coord. ref. :  
source      : demo4.nc:t2m 
varname     : t2m (2 metre temperature) 
names       : t2m_v~70400, t2m_v~48800, t2m_v~68000, t2m_v~46400, t2m_v~38400, t2m_v~16800, ... 
unit        :           K,           K,           K,           K,           K,           K, ... 
#this is where time appears in examples; it is not present in my raster

sample_dates <- terra::time(sample_rast)
sample_dates

[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

I can access the time data using ncdf4, using the following code:

library(ncdf4)
nc <- nc_open(ncfile)
date_array <- ncvar_get(nc, "valid_time")
nc_close(nc)

This makes me suspect that it's a naming problem, but I don't see anything in the doc for terra::time that would help me direct it to look for "valid_time".

I can use the date array from ncdf4, obviously, but I'd like to see if there's a way to get this same information through terra.

Upvotes: 1

Views: 131

Answers (2)

Patrick
Patrick

Reputation: 32326

ERA5 files use "time" as the name for the temporal axis of the data so I am curious to learn what specific ERA5 data you have to work with. Is this maybe the new netCDF format used by ECMWF for hourly data in a beta release?

With package ncdfCF you can read netCDF files including the attributes that define much of what is going on inside. That latter part is quite important because the CF Metadata Conventions that many netCDF files adhere to (probably most, as it is the recommended standard) use the attributes to describe many important aspects of the data variables, including its "time" dimension. Handily, ncdfCF can also produce a terra::SpatRaster for you:

library(ncdfCF)

# Open the file
nc <- open_ncdf(nc_file)

# Read the data variable of interest
(t2m <- nc[["t2m"]]$data())
> <Data> t2m 
> Long name: 2 metre temperature 
> 
> Values: [273.4626 ... 318.3601] K
>     NA: 14843752 (9.1%)
> 
> Axes:
>  id axis name      length values                      unit                             
>  0  X    longitude    78  [100 ... 107.7]             degrees_east                     
>  1  Y    latitude     87  [22.5 ... 13.9]             degrees_north                    
>  2  T    time      24097  [2019-01-01 ... 2021-10-01] hours since 1900-01-01 00:00:00.0
> 
> Attributes:
>  id name          type      length value               
>  0  scale_factor  NC_DOUBLE  1     0.000685112247836968
>  1  add_offset    NC_DOUBLE  1     295.911034397001    
>  2  _FillValue    NC_SHORT   1     -32767              
>  3  missing_value NC_SHORT   1     -32767              
>  4  units         NC_CHAR    1     K                   
>  5  long_name     NC_CHAR   19     2 metre temperature 

# Make a SpatRaster
r <- t2m$terra()

# ncdfCF puts the 3rd dimension dimnames as layer names. In the case of single-layer 
# ERA5 files that is the "time" dimension. The "time" property itself is not set.
names(r)[1:10]
>  [ 1] "2019-01-01 00:00:00" "2019-01-01 01:00:00" "2019-01-01 02:00:00"
>  [ 4] "2019-01-01 03:00:00" "2019-01-01 04:00:00" "2019-01-01 05:00:00"
>  [ 7] "2019-01-01 06:00:00" "2019-01-01 07:00:00" "2019-01-01 08:00:00"
>  [10] "2019-01-01 09:00:00"
has.time(r)
> FALSE

# Set the time information in the SpatRaster. Note that this should be a POSIXt
# but I get an error when using that
time(r) <- as.Date(names(r))
has.time(r)
> TRUE

time(r)[1:10]
>  [1] "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01"
>  [6] "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01"

A few things to note:

  • Sub-daily precision is lost because using a POSIXt on time(r)<- gives an error. If you diff(time(r)) you can see how hourly information is lost. If you need hourly information, use the information from ncdfCF or names(r).
  • If you go with the terra time information, be aware of the time zone because the ERA5 UTC is converted to whatever timezone your computer is running at.

Upvotes: 1

Robert Hijmans
Robert Hijmans

Reputation: 47481

This is what I see when I use an ERA5 file

library(terra)
#terra 1.7.83
r <- rast("total_precipitation-20171228.nc")
r
#class       : SpatRaster 
#dimensions  : 1801, 3600, 24  (nrow, ncol, nlyr)
#resolution  : 0.1, 0.1  (x, y)
#extent      : -0.05, 359.95, -90.05, 90.05  (xmin, xmax, ymin, ymax)
#coord. ref. : +proj=longlat +datum=WGS84 +no_defs 
#source      : total_precipitation-20171228.nc 
#varname     : tp (Total precipitation) 
#names       : tp_1, tp_2, tp_3, tp_4, tp_5, tp_6, ... 
#unit        :    m,    m,    m,    m,    m,    m, ... 
#time        : 2017-12-28 to 2017-12-28 23:00:00 UTC 

head(time(r))
#[1] "2017-12-28 00:00:00 UTC" "2017-12-28 01:00:00 UTC" "2017-12-28 02:00:00 UTC" "2017-12-28 03:00:00 UTC" "2017-12-28 04:00:00 UTC"
#[6] "2017-12-28 05:00:00 UTC"

Can you point to a specific file or data source that does not work as expected?

Upvotes: 0

Related Questions