code123

Reputation: 2146

Convert specific humidity into relative humidity in R or CDO for gridded data

How can I use the humidity package to convert specific humidity into relative humidity for gridded time series data? The required input variables are air temperature, specific humidity, and pressure.

Although a solution in CDO is highly desirable, a computationally efficient solution in R will suffice.

The actual data exceed 10 GB per variable, so computational efficiency matters.

Sample data:

library(raster)
library(humidity)

temperature <- brick(nl = 10)
values(temperature) <- 273.15

pressure <- brick(nl = 10)
values(pressure) <- 101325

specific_humidity <- brick(nl = 10)
values(specific_humidity) <- 0.0002153928

The function of interest in the humidity package is SH2RH. The raw code is found in this link and the constants can be sourced from here.

In the end, I will write the output to a file in NetCDF format.

Upvotes: 0

Views: 525

Answers (1)

Ben Norris

Reputation: 5747

If your data is in a data.frame or comparable format and it has columns for specific humidity (spec_humidity), temperature (temp), and pressure (press) at each time/location, then the following would work:

library(dplyr)
library(humidity)
df <- df %>%
  mutate(rel_humidity = SH2RH(q = spec_humidity,   # or whatever the SH data is called
                              t = temp,            # or whatever the temperature data is called
                              p = press,           # or whatever the pressure data is called
                              isK = TRUE))         # or isK = FALSE if temp is in degrees C

A faster version for your large file might be found in the dtplyr package, which uses data.table as its backend but otherwise keeps the same notation as dplyr.

library(data.table)
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)
library(humidity)
df <- lazy_dt(df)  # wrap df as a lazy data.table; operations are recorded, not yet run
df <- df %>%
  mutate(rel_humidity = SH2RH(q = spec_humidity,   # same as with dplyr
                              t = temp,
                              p = press,
                              isK = TRUE)) %>%     # or isK = FALSE if temp is in degrees C
  as.data.table()                                  # runs the recorded operations

Finally, there is an experimental package, multidplyr, that allows dplyr functions to be partitioned over multiple cores. Here is a link to its documentation. You would need to install it from GitHub and not CRAN (as of posting).

# install.packages("devtools")
devtools::install_github("tidyverse/multidplyr")
library(multidplyr)
library(dplyr, warn.conflicts = FALSE)
library(humidity)
cluster <- new_cluster(n)   # n is the number of cores to devote
cluster_library(cluster, "humidity")   # make SH2RH available on each worker
df <- df %>% partition(cluster)   # this splits the data over the cores in the cluster
df <- df %>%
  mutate(rel_humidity = SH2RH(q = spec_humidity,   # same as with dplyr, but in parallel
                              t = temp,
                              p = press,
                              isK = TRUE)) %>%     # or isK = FALSE if temp is in degrees C
  collect()                                        # brings data back to the main host session
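
If you would rather stay with gridded objects than reshape everything into a data.frame, the conversion itself is only arithmetic plus exp(), so it can be applied to whole arrays at once. Below is a minimal base R sketch of that idea. The function name sh_to_rh and the constants are my own assumptions (textbook Clausius-Clapeyron values, not copied from the humidity package), so results may differ slightly from SH2RH; check them against the package source before relying on them. The lon x lat x time arrays are a hypothetical stand-in for your bricks.

```r
# Sketch: vectorised specific humidity -> relative humidity conversion in base R.
# All constants are assumed textbook values, not taken from the humidity package.
sh_to_rh <- function(q, t_k, p_pa) {
  es0 <- 611      # assumed saturation vapour pressure at 273.15 K, in Pa
  t0  <- 273.15   # reference temperature, in K
  lv  <- 2.5e6    # assumed latent heat of vaporisation, in J/kg
  rw  <- 461.52   # assumed gas constant for water vapour, in J/(kg K)

  e  <- q * p_pa / (0.622 + 0.378 * q)              # partial vapour pressure, Pa
  es <- es0 * exp((lv / rw) * (1 / t0 - 1 / t_k))   # saturation vapour pressure, Pa
  100 * e / es                                      # relative humidity, %
}

# Hypothetical stand-in for gridded time series: lon x lat x time arrays
dims <- c(4, 3, 10)
temperature       <- array(273.15, dim = dims)        # K
pressure          <- array(101325, dim = dims)        # Pa
specific_humidity <- array(0.0002153928, dim = dims)  # kg/kg

rel_humidity <- sh_to_rh(specific_humidity, temperature, pressure)
dim(rel_humidity)   # same shape as the inputs
```

Because the function uses only element-wise operations, it should in principle also accept Raster* objects, which support arithmetic and exp(), but I have not tested that on data of your size.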

Upvotes: 3
