Reputation: 2146
How can I use the humidity package to convert specific humidity into relative humidity for gridded time-series data? The required input variables are air temperature, specific humidity, and pressure.
A solution in CDO would be ideal, but a computationally efficient solution in R will suffice.
The actual data size per variable is >10 GB, so computational efficiency matters.
Sample data:
library(raster)
library(humidity)
temperature <- brick(nl = 10)
values(temperature) <- 273.15
pressure <- brick(nl = 10)
values(pressure) <- 101325
specific_humidity <- brick(nl = 10)
values(specific_humidity) <- 0.0002153928
The function of interest in the humidity package is SH2RH. The raw code is found in this link and the constants can be sourced from here.
In the end I will write the output to a file in NetCDF format.
Upvotes: 0
Views: 525
Reputation: 5747
If your data is in a data.frame or comparable format and it has columns for specific humidity (spec_humidity), temperature (temp), and pressure (press) at each time/location, then the following would work:
library(dplyr)
library(humidity)
df <- df %>%
  mutate(rel_humidity = SH2RH(q = spec_humidity, # or whatever the SH data is called
                              t = temp,          # or whatever the temperature data is called
                              p = press,         # or whatever the pressure data is called
                              isK = TRUE))       # or isK = FALSE if temp is in degrees C
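Since the conversion is plain elementwise arithmetic, it can also be written as a vectorized base-R function that works directly on vectors, matrices, or arrays, without reshaping to a data.frame first. The following is only a minimal sketch, assuming the standard relation between specific humidity and vapour pressure and a Clausius-Clapeyron approximation for saturation vapour pressure. The helper name sh_to_rh and the constants are illustrative assumptions, not the package's own code, so results may differ slightly from SH2RH.

```r
# Assumed textbook constants (the humidity package may use different values)
EPS   <- 0.622    # ratio of molar masses, water vapour / dry air
ES_T0 <- 611      # saturation vapour pressure at T0, in Pa
T0    <- 273.15   # reference temperature, in K
L_VAP <- 2.5e6    # latent heat of vaporisation, in J/kg
R_W   <- 461.52   # specific gas constant of water vapour, in J/(kg K)

# Hypothetical helper: specific humidity (kg/kg), temperature (K) and
# pressure (Pa) in, relative humidity (%) out. Fully vectorized.
sh_to_rh <- function(q, t_k, p_pa) {
  e  <- q * p_pa / (EPS + (1 - EPS) * q)                  # vapour pressure, Pa
  es <- ES_T0 * exp((L_VAP / R_W) * (1 / T0 - 1 / t_k))   # saturation vapour pressure, Pa
  100 * e / es
}

# Small lon x lat x time arrays filled with the sample values from the question
q    <- array(0.0002153928, dim = c(4, 3, 10))
t_k  <- array(273.15,       dim = dim(q))
p_pa <- array(101325,       dim = dim(q))

rh <- sh_to_rh(q, t_k, p_pa)   # same dimensions as the inputs
```

Because the function only uses elementwise operators, the output keeps the shape of the inputs, which makes it easy to apply to values extracted from gridded data.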
A faster version for your large file might be found in the dtplyr package, which uses data.tables (but otherwise the same notation as dplyr).
library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)
library(humidity)
df <- lazy_dt(df) # convert df into a data.table. The 'lazy' part tracks operations
df <- df %>%
  mutate(rel_humidity = SH2RH(q = spec_humidity, # same as with dplyr
                              t = temp,
                              p = press,
                              isK = TRUE)) %>%   # or isK = FALSE if temp is in degrees C
  as_tibble()                                    # forces the lazy query to be computed
Finally, there is an experimental multidplyr package that allows dplyr functions to be partitioned over multiple cores. Here is a link to its documentation. You would need to install it from GitHub and not CRAN (as of posting).
# install.packages("devtools")
devtools::install_github("tidyverse/multidplyr")
library(multidplyr)
library(dplyr, warn.conflicts = FALSE)
library(humidity)
cluster <- new_cluster(n) # n is number of cores to devote
df <- df %>% partition(cluster) # this splits the data over the cores in the cluster
df <- df %>%
  mutate(rel_humidity = SH2RH(q = spec_humidity, # same as with dplyr, but in parallel
                              t = temp,
                              p = press,
                              isK = TRUE)) %>%   # or isK = FALSE if temp is in degrees C
  collect() # Brings data back to main host session
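The partition and collect steps above can be imitated in base R by splitting the index range into chunks, processing each chunk, and recombining the pieces. The sketch below runs the chunks sequentially and uses a hypothetical sh_to_rh helper with assumed textbook constants (not the package's SH2RH), purely to illustrate the pattern.

```r
# Hypothetical helper with assumed constants: q in kg/kg, t_k in K, p_pa in Pa
sh_to_rh <- function(q, t_k, p_pa) {
  e  <- q * p_pa / (0.622 + 0.378 * q)                          # vapour pressure, Pa
  es <- 611 * exp((2.5e6 / 461.52) * (1 / 273.15 - 1 / t_k))    # saturation vapour pressure, Pa
  100 * e / es
}

# Sample inputs using the values from the question
n    <- 1000
q    <- rep(0.0002153928, n)
t_k  <- rep(273.15, n)
p_pa <- rep(101325, n)

# "partition": split the indices into 4 contiguous, roughly equal chunks
chunks <- split(seq_len(n), cut(seq_len(n), breaks = 4, labels = FALSE))

# process each chunk on its own (sequentially in this sketch)
parts <- lapply(chunks, function(i) sh_to_rh(q[i], t_k[i], p_pa[i]))

# "collect": stitch the pieces back together in the original order
rh_chunked <- unlist(parts, use.names = FALSE)
```

Chunking like this mainly helps keep memory bounded when each chunk is read from disk separately. The lapply call could be swapped for parallel::parLapply on a cluster created with parallel::makeCluster, provided the inputs are also exported to the workers, but for arithmetic this cheap the data transfer overhead may outweigh any gain.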
Upvotes: 3