Giulia Poggi
Giulia Poggi

Reputation: 11

Longitude and Latitude in RHO points to natural coordinate in ncdf4 file in R

I have a URL on the pH of the gulf of Alaska. I cannot open the variable normally (

dput(`pH <- ncvar_get(nc, "pH")`)

) because the dataset is too big, so I should choose some subsets. This variable has 4 dimensions: time, depth, latitude and longitude in rho points. I need to extrapolate the dataset from specific locations (that I know in natural coordinates) in the gulf of Alaska. My question is how do I get this data? What are the commands on R to get this data? Do I need to look for something in specific?

I tried to use other packages (raster or oce) but it did not really work. I have the the maximum and minimum latitude and longitude in both rho-points and natural coordinates. I tried to ask to Chat GPT for some commands (like the "neighbor approach") that could help me pinpoint the areas that I need, but it did not work as well.

This is my code so far:

library (ncdf4)
#Getting the file 
url <- "https://thredds.aoos.org/thredds/dodsC/AOOS_GAK_ROMS_BGC_V2_MONTHLY_DIAG.nc"
nc <- nc_open(url)

# getting to know the variables
names(nc$var)


# getting variable pH characteristics since is to be to open like any other variable 
pH_var <- nc$var$"pH"
pH_dim_names <- pH_var$dim
print(pH_dim_names)
print(pH)
## these are all the four dimensions of the variable: eta_rho, xi_rho, depth and ocean_time 

# time 
ocean_time_dim <- nc$dim$"ocean_time"
print(ocean_time_dim)

## We know that the length is 348 units 
## we know that the units is "seconds since 1900-01-01 00:00:00", which means that for example the first variable 2936131200 represents the number of seconds since 1900-01-01
## we also know that the dataset range is from 1993-01-01T00:00:00Z to 2021-12-31T00:00:00Z


# the goal is to turn the units in seconds to monthly average where YY-MM-DD
start_timestamp <- as.numeric(ocean_time_dim$start)
time_values_seconds <- ocean_time_dim$vals
time_values <- as.POSIXct(time_values_seconds, origin = "1900-01-01", tz = "UTC")
### To check the numbers: 
print(head(time_values))
print(tail(time_values))
print(tail(time_values, 10))



# depth 
depth <- ncvar_get(nc, "depth")
print(depth)
dim(depth)
class(depth)
summary (depth)

## so we know that there are 33 values.
## the range goes from 0 (max measure) to -5500 (min measure)

## latitude and longitude

lon <- ncvar_get (nc, "lon")
lat <- ncvar_get(nc, "lat")

attributes(lon)
attributes (lat)
class(lon)
class (lat)

longitude_range <- range(lon)
print(longitude_range)
latitude_range <-range(lat)
print (latitude_range)

lon_attrs <- ncatt_get(nc, "lon")
lat_attrs <- ncatt_get(nc, "lat")
print(lon_attrs)
print(lat_attrs)

### the dimensions are 326 and 282 (?)
### it's a matrix and array 
### the range for the latitude is  47.99876 <->65.25184
### the range for the longitude is 197.9042<-> 228.9531

## we know that this is the latitude and longitude are in RHO-points, so they are not natural coordinates 
## the question is that how do i turn them into natural coordinates 

## I know from the metadata: 
#### the geographical North= 62 and South= 48
### the geographical bound coort East = -133 and West = -163

Upvotes: 1

Views: 56

Answers (1)

Patrick
Patrick

Reputation: 32326

Your data has a so-called rotated latitude-longitude grid. What is north in your data is not geographic north. The only package in R that deals with this correctly is ncdfCF:

library(ncdfCF)

url <- "https://thredds.aoos.org/thredds/dodsC/AOOS_GAK_ROMS_BGC_V2_MONTHLY_DIAG.nc"
(ds <- open_ncdf(url))
#> <Dataset> AOOS_GAK_ROMS_BGC_V2_MONTHLY_DIAG 
#> Resource   : https://thredds.aoos.org/thredds/dodsC/AOOS_GAK_ROMS_BGC_V2_MONTHLY_DIAG.nc 
#> Format     : classic 
#> Type       : generic netCDF data 
#> Conventions: CF-1.7,ACDD-1.3 
#> Keep open  : FALSE 
#> 
#> Variables:
#>  name              long_name                                          units                  data_type axes
#>  aggloss_n_100_di  Loss to Aggregation of Diazotroph Phytoplankton... mol.m-2.s-1 NC_FLOAT  xi_rho, eta_rho, ocean_time
#>  aggloss_n_100_lg  Loss to Aggregation of Large Phytoplankton Over... mol.m-2.s-1 NC_FLOAT  xi_rho, eta_rho, ocean_time
#>  ... (many variables omitted)    
#>  pH                pH (Derived solely from H+ ion concentration)      mol.m-2.s-1 NC_FLOAT  xi_rho, eta_rho, depth, ocean_time
#>  ... (many more variables omitted)
#>  zloss_n_100_sm    Loss to Zooplankton of Small Phytoplankton Over... mol.m-2.s-1 NC_FLOAT  xi_rho, eta_rho, ocean_time
#>  zloss_n_100_smz   Loss to Zooplankton of Small Zooplankton Over U... mol.m-2.s-1 NC_FLOAT  xi_rho, eta_rho, ocean_time  
#> 
#> Axes:
#>  id axis name       long_name                          length values                                        unit                        
#>  0       depth      depth from free surface             33    [0 ... -5500]                                 meters                 
#>  2  T    ocean_time averaged time since initialization 372    [1993-01-16 00:00:00 ... 2023-12-15 12:00:00] seconds since 1900-01-01 00:00:00
#>  1       eta_rho                                       282    [1 ... 282]
#>  3       xi_rho                                        326    [1 ... 326]                            
#> 
#> Attributes:
#> ... (75 attributes omitted)

(pH <- ds[["pH"]])
#> <Variable> pH 
#> Long name: pH (Derived solely from H+ ion concentration) 
#> 
#> Axes:
#>  id axis name       long_name                          length values                                        unit
#>  3       xi_rho                                        326    [1 ... 326]
#>  1       eta_rho                                       282    [1 ... 282]
#>  0       depth      depth from free surface             33    [0 ... -5500]                                 meters
#>  2  T    ocean_time averaged time since initialization 372    [1993-01-16 00:00:00 ... 2023-12-15 12:00:00] seconds since 1900-01-01 00:00:00
#>                                   
#> Auxiliary longitude-latitude grid:
#> Error: Could not read longitude data for auxiliary long-lat grid
(dt <- pH$subset(list(X = c(-156, -148), Y = c(55, 59))))
#> Error: Could not read latitude data for auxiliary long-lat grid

Oops! The file is not readable, corrupt, or I have insufficient access rights to read a large chunk of data.

The pH$subset() method would use the included auxiliary latitude-longitude grids that give the geographic location for every rho point, but as you can see, I cannot read those grids. If you can on your end, then dt is a CFData object. You can then access the oriented array of data (meaning that the data is oriented in the regular R order) with the dt$array() method or straight to a terra::SpatRasterDataset with dt$terra().

Upvotes: 0

Related Questions