docjay

Reputation: 757

R - (Tidyverse) Add Rows to Dataframe Within Intervals

I've been playing around with tidyverse and I am currently a little stuck. The goal is to expand a data frame so that each interval becomes one row per value in that interval. Two variables represent the bounds (lower and upper); for every step from the lower to the upper bound, a new row should be created in which the lower bound has been incremented by one and all other information is duplicated.

I've been trying combinations of functions, but I can't think of a clever solution, which I'm sure exists in the tidyverse family.

Create example dataframe: (NA values for completeness)

df_example <- data.frame(l_bound = c('A00', 'B00', 'C00'), 
                         u_bound = c('A05', 'B03', NA), 
                         value = 1:3)

Output:

  l_bound u_bound value
1     A00     A05     1
2     B00     B03     2
3     C00    <NA>     3

Desired outcome:

   result value
1     A00     1
2     A01     1
3     A02     1
4     A03     1
5     A04     1
6     A05     1
7     B00     2
8     B01     2
9     B02     2
10    B03     2
11    C00     3

Any help will be greatly appreciated!

Upvotes: 2

Views: 239

Answers (1)

akrun

Reputation: 887721

Here is an option using tidyverse. We use map2 to build, for each row, the sequence between the numeric parts of 'l_bound' and 'u_bound' (extracted with parse_number) and store it in a list column. We then select only the relevant columns and unnest.

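library(tidyverse)  # attaches purrr (map2), tidyr (unnest) and readr (parse_number)
df_example %>%
    # for each row, build the vector of labels between the two bounds
    mutate(result = map2(l_bound, u_bound, ~
      if (!is.na(.y))
        # two-character prefix ("A0") + 0:5; assumes the numeric part stays below 10
        paste0(substr(.x, 1, 2), parse_number(.x):parse_number(.y))
      else as.character(.x))) %>%
    select(result, value) %>%
    # expand the list column to one row per label
    unnest(result)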

Or using similar methodology in data.table

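library(data.table)
library(readr)  # for parse_number()
# group by 'value' (one row per group) and return the expanded labels as 'result'
setDT(df_example)[, .(result = if (!is.na(u_bound))
        paste0(substr(l_bound, 1, 2), parse_number(l_bound):parse_number(u_bound))
      else as.character(l_bound)), by = value]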
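
Both snippets rebuild each label by pasting the first two characters of 'l_bound' onto the sequence, which works here because every number in the example is a single digit. If the bounds could go past 9, a zero-padded variant might be safer. The following is only a sketch in base R, assuming every label is a letter prefix followed by digits; `expand_bounds` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: expand one (lower, upper) pair into zero-padded labels.
# Assumes labels look like "A00": a non-digit prefix followed by digits.
expand_bounds <- function(lower, upper) {
  lower <- as.character(lower)
  if (is.na(upper)) return(lower)              # no upper bound: keep the single label
  prefix <- sub("[0-9]+$", "", lower)          # non-digit part, e.g. "A"
  width  <- nchar(lower) - nchar(prefix)       # number of digits to pad to
  from   <- as.integer(sub("^[^0-9]+", "", lower))
  to     <- as.integer(sub("^[^0-9]+", "", as.character(upper)))
  paste0(prefix, formatC(seq(from, to), width = width, flag = "0"))
}

df_example <- data.frame(l_bound = c("A00", "B00", "C00"),
                         u_bound = c("A05", "B03", NA),
                         value = 1:3,
                         stringsAsFactors = FALSE)

# Expand every row, then repeat 'value' once per generated label.
labels <- Map(expand_bounds, df_example$l_bound, df_example$u_bound)
result <- data.frame(result = unlist(labels, use.names = FALSE),
                     value  = rep(df_example$value, lengths(labels)),
                     stringsAsFactors = FALSE)
```

With this helper, `expand_bounds("A08", "A11")` gives "A08", "A09", "A10", "A11", whereas pasting onto `substr(.x, 1, 2)` would produce "A010" and "A011" for the last two.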

Upvotes: 2
