Reputation: 2575
Here is my data frame. For all the page_nums between section_start and section_end, I want to create fill_section_num with section_num values. For example, if the page_nums is between 5 through 6, the section_num is 2. So, in the fill_section_num, I want 2 for 5th and 6th row and so on.
df <- tribble(
~page_nums, ~section_start, ~section_end, ~section_num, ~fill_section_num,
1, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
2, NA_integer_, NA_integer_, NA_integer_, 1,
3, 2, 4, 1, 1,
4, 5, 6, 2, 1,
5, 7, 7, 3, 2,
6, 8, 10, 4, 2,
7, NA_integer_, NA_integer_, NA_integer_, 3,
8, NA_integer_, NA_integer_, NA_integer_, 4,
9, NA_integer_, NA_integer_, NA_integer_, 4,
10, NA_integer_, NA_integer_, NA_integer_, 4)
I was planing to do something like this, but failed:
df <- df %>%
mutate(fill_section_num = if_else(between(page_nums, section_start, section_end), section_num, NA_real_))
The desired output is the fill_section_num column calculated using the R code.
Upvotes: 0
Views: 63
Reputation: 79228
You can expand the section start:section_end, then unnest and finally join the table to itself:
df%>%
mutate(nn=map2(section_start,section_end,
~if(any(is.na(c(.x,.y)))) NA else .x:.y))%>%
unnest()%>%
select(nn,section_num)%>%
right_join(df,c(nn="page_nums"))%>%
select(-section_num.x,everything())
# A tibble: 10 x 6
nn section_start section_end section_num.y fill_section_num section_num.x
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 NA NA NA NA NA
2 2 NA NA NA 1 1
3 3 2 4 1 1 1
4 4 5 6 2 1 1
5 5 7 7 3 2 2
6 6 8 10 4 2 2
7 7 NA NA NA 3 3
8 8 NA NA NA 4 4
9 9 NA NA NA 4 4
10 10 NA NA NA 4 4
The last column above is what you are looking for.
Upvotes: 1
Reputation: 6073
One way to proceed:
# example data (with simpler varnames because I'm lazy):
df <- data.frame(
pages = 3:6,
start = c(2,5,7,8),
end = c(4,6,7,10),
section = 1:4
)
# create two-column dataframe of all page numbers and the fill_section they belong to
myfun <- function(x) {
cbind( pages = seq(x[1], x[2]),
full_section = rep(x[3], times=x[2]-x[1]+1) )
}
temp_list <- apply(t(df[,2:4]), 2, myfun)
temp_df <- data.frame(do.call(rbind, temp_list))
# Merge this fill_section information back onto the original dataframe:
result <- merge(df, temp_df, by="pages")
Check that the result makes sense:
result
pages start end section fill_section
3 2 4 1 1
4 5 6 2 1
5 7 7 3 2
6 8 10 4 2
Upvotes: 1