Reputation: 424
I have netCDF file, which contains temperature data over some location. Data shape is 1450x900.
I am creating search functionality in my app, to locate temperature data with lat, lon values.
So I extracted lat and lon coordinates data from netCDf file, but I was expecting that they would be 1D arrays and instead got 2D arrays with 1450x900 shape for both coordinates.
So my question: why they are 2d arrays, instead of 1450 latitude values and 900 longitude values? Doesn't 1450 lat values and 900 lon values describe whole grid?
Lets say we have 4x5 square, indices for locating rightmost and bottom-most point of the grid will be [4, 5]. So my indices for x will be[1, 2, 3, 4] and for y: [1, 2, 3, 4, 5]. 9 indices in total are enough to locate any point on that grid (consisting of 20 cells). So why do lat (x) and lon (y) coordinates in netcdf file contain 20 indices separately (40 in total), instead of 4 and 5 indices respectively (9 in total)? Hope you get what confuses me.
Is it possible to somehow map those 2D arrays and "downgrade" to 1450 latitude values and 900 longitude values? OR it is ok as it is right now? How can I use those values for my intention? Do I need to zip lat lon arrays?
here are the shapes:
>>> DS = xarray.open_dataset('file.nc')
>>> DS.tasmin.shape
(31, 1450, 900)
>>> DS.projection_x_coordinate.shape
(900,)
>>> DS.projection_y_coordinate.shape
(1450,)
>>> DS.latitude.shape
(1450, 900)
>>> DS.longitude.shape
(1450, 900)
consider that projection_x_coordinate
and projection_y_coordinate
are easting/northing values not lat/longs
here is the metadata of file if needed:
Dimensions: (bnds: 2, projection_x_coordinate: 900, projection_y_coordinate: 1450, time: 31) Coordinates: * time (time) datetime64[ns] 2018-12-01T12:00:00 .... * projection_y_coordinate (projection_y_coordinate) float64 -1.995e+0... * projection_x_coordinate (projection_x_coordinate) float64 -1.995e+0... latitude (projection_y_coordinate, projection_x_coordinate) float64 ... longitude (projection_y_coordinate, projection_x_coordinate) float64 ... Dimensions without coordinates: bnds Data variables: tasmin (time, projection_y_coordinate, projection_x_coordinate) float64 ... transverse_mercator int32 ... time_bnds (time, bnds) datetime64[ns] ... projection_y_coordinate_bnds (projection_y_coordinate, bnds) float64 ... projection_x_coordinate_bnds (projection_x_coordinate, bnds) float64 ... Attributes: comment: Daily resolution gridded climate observations creation_date: 2019-08-21T21:26:02 frequency: day institution: Met Office references: doi: 10.1002/joc.1161 short_name: daily_mintemp source: HadUK-Grid_v1.0.1.0 title: Gridded surface climate observations data for the UK version: v20190808 Conventions: CF-1.5
Upvotes: 2
Views: 3309
Reputation: 13079
Your data adheres to version 1.5 of the Climate and Forecast conventions.
The document describing this version of the conventions is here, although the relevant section is essentially unchanged across many versions of the conventions.
See section 5.2:
5.2. Two-Dimensional Latitude, Longitude, Coordinate Variables
The latitude and longitude coordinates of a horizontal grid that was not defined as a Cartesian product of latitude and longitude axes, can sometimes be represented using two-dimensional coordinate variables. These variables are identified as coordinates by use of the coordinates attribute.
It looks like you are using the HadOBS 1km resolution gridded daily minimum temperature, and this file in particular:
http://dap.ceda.ac.uk/thredds/fileServer/badc/ukmo-hadobs/data/insitu/MOHC/HadOBS/HadUK-Grid/v1.0.1.0/1km/tasmin/day/v20190808/tasmin_hadukgrid_uk_1km_day_20181201-20181231.nc (warning: >300MB download)
As it states, the data is on a transverse mercator grid.
If you look at output from ncdump -h <filename>
you will also see the following description of the grid expressed by means of attributes of the transverse_mercator
dummy variable:
int transverse_mercator ;
transverse_mercator:grid_mapping_name = "transverse_mercator" ;
transverse_mercator:longitude_of_prime_meridian = 0. ;
transverse_mercator:semi_major_axis = 6377563.396 ;
transverse_mercator:semi_minor_axis = 6356256.909 ;
transverse_mercator:longitude_of_central_meridian = -2. ;
transverse_mercator:latitude_of_projection_origin = 49. ;
transverse_mercator:false_easting = 400000. ;
transverse_mercator:false_northing = -100000. ;
transverse_mercator:scale_factor_at_central_meridian = 0.9996012717 ;
and you will also see that the coordinate variables projection_x_coordinate
and projection_y_coordinate
have units of metres.
The grid in question is the Ordnance Survey UK grid using numeric grid references.
See for example this description of the OS grid (from Wikipedia).
If you wish to express the data on a regular longitude-latitude grid then you will need to do some type of interpolation. I see that you are using xarray. You can combine this with pyresample
to do the interpolation. Here is an example:
import xarray as xr
import numpy as np
from pyresample.geometry import SwathDefinition
from pyresample.kd_tree import resample_nearest, resample_gauss
ds = xr.open_dataset("tasmin_hadukgrid_uk_1km_day_20181201-20181231.nc")
# Define a target grid. For sake of example, here is one with just
# 3 longitudes and 4 latitudes.
lons = np.array([-2.1, -2., -1.9])
lats = np.array([51.7, 51.8, 51.9, 52.0])
# The target grid is regular (1-d lon, lat coordinates) but we will need
# a 2d version (similar to the input grid), so use numpy.meshgrid to produce this.
lon2d, lat2d = np.meshgrid(lons, lats)
origin_grid = SwathDefinition(lons=ds.longitude, lats=ds.latitude)
target_grid = SwathDefinition(lons=lon2d, lats=lat2d)
# get a numpy array for the first timestep
data = ds.tasmin[0].to_masked_array()
# nearest neighbour interpolation example
# Note that radius_of_influence has units metres
interpolated = resample_nearest(origin_grid, data, target_grid, radius_of_influence=1000)
# GIVES:
# array([[5.12490065, 5.02715332, 5.36414835],
# [5.08337723, 4.96372838, 5.00862833],
# [6.47538931, 5.53855722, 5.11511239],
# [6.46571817, 6.17949381, 5.87357538]])
# gaussian weighted interpolation example
# Note that radius_of_influence and sigmas both have units metres
interpolated = resample_gauss(origin_grid, data, target_grid, radius_of_influence=1000, sigmas=1000)
# GIVES:
# array([[5.20432465, 5.07436805, 5.39693221],
# [5.09069187, 4.8565934 , 5.08191639],
# [6.4505963 , 5.44018209, 5.13774416],
# [6.47345359, 6.2386732 , 5.62121948]])
Upvotes: 7
Reputation: 424
I figured an answer by myself.
As appeared 2D lat long arrays are used to define the "grid" of some location.
In other words, if we zip
lat long values and project on the map, we will get "curved grid" (earth curvature is considered in other words) over some location, which then are used to create grid reference of location.
Hope its clear for anyone interested.
Upvotes: 1