Reputation: 11
I have a URL on the pH of the gulf of Alaska. I cannot open the variable normally (
dput(`pH <- ncvar_get(nc, "pH")`)
) because the dataset is too big, so I should choose some subsets. This variable has 4 dimensions: time, depth, latitude and longitude in rho points. I need to extrapolate the dataset from specific locations (that I know in natural coordinates) in the gulf of Alaska. My question is how do I get this data? What are the commands on R to get this data? Do I need to look for something in specific?
I tried to use other packages (raster or oce) but it did not really work. I have the the maximum and minimum latitude and longitude in both rho-points and natural coordinates. I tried to ask to Chat GPT for some commands (like the "neighbor approach") that could help me pinpoint the areas that I need, but it did not work as well.
This is my code so far:
library (ncdf4)
#Getting the file
url <- "https://thredds.aoos.org/thredds/dodsC/AOOS_GAK_ROMS_BGC_V2_MONTHLY_DIAG.nc"
nc <- nc_open(url)
# getting to know the variables
names(nc$var)
# getting variable pH characteristics since is to be to open like any other variable
pH_var <- nc$var$"pH"
pH_dim_names <- pH_var$dim
print(pH_dim_names)
print(pH)
## these are all the four dimensions of the variable: eta_rho, xi_rho, depth and ocean_time
# time
ocean_time_dim <- nc$dim$"ocean_time"
print(ocean_time_dim)
## We know that the length is 348 units
## we know that the units is "seconds since 1900-01-01 00:00:00", which means that for example the first variable 2936131200 represents the number of seconds since 1900-01-01
## we also know that the dataset range is from 1993-01-01T00:00:00Z to 2021-12-31T00:00:00Z
# the goal is to turn the units in seconds to monthly average where YY-MM-DD
start_timestamp <- as.numeric(ocean_time_dim$start)
time_values_seconds <- ocean_time_dim$vals
time_values <- as.POSIXct(time_values_seconds, origin = "1900-01-01", tz = "UTC")
### To check the numbers:
print(head(time_values))
print(tail(time_values))
print(tail(time_values, 10))
# depth
depth <- ncvar_get(nc, "depth")
print(depth)
dim(depth)
class(depth)
summary (depth)
## so we know that there are 33 values.
## the range goes from 0 (max measure) to -5500 (min measure)
## latitude and longitude
lon <- ncvar_get (nc, "lon")
lat <- ncvar_get(nc, "lat")
attributes(lon)
attributes (lat)
class(lon)
class (lat)
longitude_range <- range(lon)
print(longitude_range)
latitude_range <-range(lat)
print (latitude_range)
lon_attrs <- ncatt_get(nc, "lon")
lat_attrs <- ncatt_get(nc, "lat")
print(lon_attrs)
print(lat_attrs)
### the dimensions are 326 and 282 (?)
### it's a matrix and array
### the range for the latitude is 47.99876 <->65.25184
### the range for the longitude is 197.9042<-> 228.9531
## we know that this is the latitude and longitude are in RHO-points, so they are not natural coordinates
## the question is that how do i turn them into natural coordinates
## I know from the metadata:
#### the geographical North= 62 and South= 48
### the geographical bound coort East = -133 and West = -163
Upvotes: 1
Views: 56
Reputation: 32326
Your data has a so-called rotated latitude-longitude grid. What is north in your data is not geographic north. The only package in R that deals with this correctly is ncdfCF
:
library(ncdfCF)
url <- "https://thredds.aoos.org/thredds/dodsC/AOOS_GAK_ROMS_BGC_V2_MONTHLY_DIAG.nc"
(ds <- open_ncdf(url))
#> <Dataset> AOOS_GAK_ROMS_BGC_V2_MONTHLY_DIAG
#> Resource : https://thredds.aoos.org/thredds/dodsC/AOOS_GAK_ROMS_BGC_V2_MONTHLY_DIAG.nc
#> Format : classic
#> Type : generic netCDF data
#> Conventions: CF-1.7,ACDD-1.3
#> Keep open : FALSE
#>
#> Variables:
#> name long_name units data_type axes
#> aggloss_n_100_di Loss to Aggregation of Diazotroph Phytoplankton... mol.m-2.s-1 NC_FLOAT xi_rho, eta_rho, ocean_time
#> aggloss_n_100_lg Loss to Aggregation of Large Phytoplankton Over... mol.m-2.s-1 NC_FLOAT xi_rho, eta_rho, ocean_time
#> ... (many variables omitted)
#> pH pH (Derived solely from H+ ion concentration) mol.m-2.s-1 NC_FLOAT xi_rho, eta_rho, depth, ocean_time
#> ... (many more variables omitted)
#> zloss_n_100_sm Loss to Zooplankton of Small Phytoplankton Over... mol.m-2.s-1 NC_FLOAT xi_rho, eta_rho, ocean_time
#> zloss_n_100_smz Loss to Zooplankton of Small Zooplankton Over U... mol.m-2.s-1 NC_FLOAT xi_rho, eta_rho, ocean_time
#>
#> Axes:
#> id axis name long_name length values unit
#> 0 depth depth from free surface 33 [0 ... -5500] meters
#> 2 T ocean_time averaged time since initialization 372 [1993-01-16 00:00:00 ... 2023-12-15 12:00:00] seconds since 1900-01-01 00:00:00
#> 1 eta_rho 282 [1 ... 282]
#> 3 xi_rho 326 [1 ... 326]
#>
#> Attributes:
#> ... (75 attributes omitted)
(pH <- ds[["pH"]])
#> <Variable> pH
#> Long name: pH (Derived solely from H+ ion concentration)
#>
#> Axes:
#> id axis name long_name length values unit
#> 3 xi_rho 326 [1 ... 326]
#> 1 eta_rho 282 [1 ... 282]
#> 0 depth depth from free surface 33 [0 ... -5500] meters
#> 2 T ocean_time averaged time since initialization 372 [1993-01-16 00:00:00 ... 2023-12-15 12:00:00] seconds since 1900-01-01 00:00:00
#>
#> Auxiliary longitude-latitude grid:
#> Error: Could not read longitude data for auxiliary long-lat grid
(dt <- pH$subset(list(X = c(-156, -148), Y = c(55, 59))))
#> Error: Could not read latitude data for auxiliary long-lat grid
Oops! The file is not readable, corrupt, or I have insufficient access rights to read a large chunk of data.
The pH$subset()
method would use the included auxiliary latitude-longitude grids that give the geographic location for every rho point, but as you can see, I cannot read those grids. If you can on your end, then dt
is a CFData
object. You can then access the oriented array of data (meaning that the data is oriented in the regular R order) with the dt$array()
method or straight to a terra::SpatRasterDataset
with dt$terra()
.
Upvotes: 0