Jared Brewer

Reputation: 154

Interpolating Nested Data in R

I'm a relative newcomer to purrr and nested data, and I'm having a problem I don't understand. Basically, I have a series of discontinuous vertical profiles (separated by region and pressure level) that I need to interpolate. The dput() output is below, but I need to interpolate from data that is at 200, 300, 500, 700, and 1000 to data at, for example, seq(100,1000,100).

This seems like a perfect use case for nest(), so I tried the following:

OH_Interp <- as.tibble(Spiv.OH) %>%
  group_by(Month, Region) %>%
  nest() %>%
  mutate(interpolation = modify(data, spline,
                                      xout = seq(100,1000,100))) %>%
  select(Month, Region, interpolation) %>%
  unnest()

I expected this to return a 40x4 dataframe mirroring the structure of the input dataframe, with the interpolated data in place of the original data. The spline() call appears to have worked correctly, but because the interpolation column doesn't preserve the structure of data, I can't use unnest() as I expected. Instead, it merely destroys the internal structure of the interpolation list and doubles the length of the output to an 8x3 tibble. How can I make it output a 40x4 dataframe instead?

Thank you in advance for your help!

The dataframe in question:

structure(list(Pressure = c(1000L, 900L, 800L, 700L, 500L, 300L, 200L, 1000L, 900L, 800L, 700L, 500L, 300L, 200L, 1000L, 900L, 800L, 700L, 500L, 300L, 200L, 1000L, 900L, 800L, 700L, 500L, 300L, 200L), Value = c(8.12, 11.76, 17.14, 20.5, 21.08, 14.2, 11.2, 7.14, 7.59, 7.98, 8.45, 8.72, 8.94, 8.94, 3.24, 4.12, 4.74, 5.36, 5.34, 4.26, 3.94, 0.09, 0.09, 0.1, 0.11, 0.12, 0.1, 0.1 ), Month = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("January", "July"), class = "factor"), Region = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Polar", "Subtropics"), class = "factor")), .Names = c("Pressure", "Value", "Month", "Region"), class = "data.frame", row.names = c(NA, -28L ))

Jared

Upvotes: 1

Views: 160

Answers (1)

Marius

Reputation: 60070

spline() returns a plain list with x and y components, which unnest() can't expand into rows. If you convert the spline output to a dataframe/tibble, you can unnest it easily:

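as_tibble(Spiv.OH) %>%
    group_by(Month, Region) %>%
    nest() %>%
    # spline() takes the first two columns of each nested data frame (Pressure, Value) as x and y
    mutate(interpolation = modify(data, 
                                  function(df, ...) {
                                      # convert the list(x, y) result to a tibble so it can be unnested
                                      spline(df, ...) %>% as_tibble()
                                  },
                                  xout = seq(100,1000,100))) %>%
    select(Month, Region, interpolation) %>%
    unnest(interpolation)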

Output:

# A tibble: 40 x 4
   Month   Region         x     y
   <fct>   <fct>      <dbl> <dbl>
 1 January Subtropics   100 10.4 
 2 January Subtropics   200 11.2 
 3 January Subtropics   300 14.2 
 4 January Subtropics   400 18.1 
 5 January Subtropics   500 21.1 
 6 January Subtropics   600 21.9 
 7 January Subtropics   700 20.5 
 8 January Subtropics   800 17.1 
 9 January Subtropics   900 11.8 
10 January Subtropics  1000  8.12
# ... with 30 more rows
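
If you want the result to mirror the input columns (Pressure and Value rather than x and y), one option is to build the tibble yourself inside a small helper. The following is only a sketch: interp_profile is a hypothetical helper name, not part of purrr or tidyr, and the data frame is rebuilt from the values in the question's dput() so the snippet runs on its own. It uses map() in place of modify(), which should behave the same for a list column here.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Rebuild the question's data (same values as the dput() output)
Spiv.OH <- data.frame(
  Pressure = rep(c(1000, 900, 800, 700, 500, 300, 200), times = 4),
  Value = c(8.12, 11.76, 17.14, 20.5, 21.08, 14.2, 11.2,
            7.14, 7.59, 7.98, 8.45, 8.72, 8.94, 8.94,
            3.24, 4.12, 4.74, 5.36, 5.34, 4.26, 3.94,
            0.09, 0.09, 0.1, 0.11, 0.12, 0.1, 0.1),
  Month = rep(c("January", "July"), each = 14),
  Region = rep(c("Subtropics", "Polar", "Subtropics", "Polar"), each = 7)
)

# Hypothetical helper: interpolate one profile and return a tibble
# that keeps the original column names instead of x and y
interp_profile <- function(df, xout = seq(100, 1000, 100)) {
  res <- spline(df$Pressure, df$Value, xout = xout)  # list with $x and $y
  tibble(Pressure = res$x, Value = res$y)
}

result <- Spiv.OH %>%
  group_by(Month, Region) %>%
  nest() %>%
  mutate(interpolation = map(data, interp_profile)) %>%
  select(Month, Region, interpolation) %>%
  unnest(interpolation)
```

Naming the columns explicitly also means the helper doesn't depend on Pressure and Value being the first two columns of the nested data.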

Upvotes: 2
