Reputation: 37
I have two columns, and the difference of the columns in another, as such:
df <- data.frame(start_value = c(111000, 839201, 667293, 198374),
end_value = c(111006, 839212, 667300, 198379))
What I'd like is to create separate columns that have the next value after start_value until end_value is met (aka until the difference between two columns is met), for each row. like this:
new_df <- data.frame(start_value = c(111000, 839201, 667293, 198374),
second_value = c(111001, 839202, 667294, 198375),
third_value = c(111002, 839203, 667295, 198376),
fourth_value = c(111003, 839204, 667296, 198377),
fifth_value = c(111004, NA, 667297, 198378),
sixth_value = c(NA, NA, 667297, 198378),
seventh_value = c(NA, NA, NA, 198379),
end_value = c(111004, 667296, 667297, 198379))
The row names are negligible and can be anything. Basically, I just need to create rows that count starting from start_value until end_value.
Upvotes: 0
Views: 67
Reputation: 16971
A couple one-line options (returns a data.frame
):
plyr::ldply(mapply(seq, df$start_value, df$end_value), rbind)
or base R (returns a matrix):
t(sapply(mapply(seq, df$start_value, df$end_value), "length<-", max(df$end_value - df$start_value + 1L)))
Upvotes: 0
Reputation: 79188
cbind(df[1], plyr::rbind.fill.matrix(Map(function(x, y)t(seq(x, y)), df$start_value, df$end_value)), df[2])
start_value 1 2 3 4 5 6 7 8 9 10 11 12 end_value
1 111000 111000 111001 111002 111003 111004 111005 111006 NA NA NA NA NA 111006
2 839201 839201 839202 839203 839204 839205 839206 839207 839208 839209 839210 839211 839212 839212
3 667293 667293 667294 667295 667296 667297 667298 667299 667300 NA NA NA NA 667300
4 198374 198374 198375 198376 198377 198378 198379 NA NA NA NA NA NA 198379
Using tidyverse:
df %>%
rowwise() %>%
mutate(dat = list(seq(start_value, end_value)))%>%
unnest_wider(dat) %>%
select(-end_value, everything(), end_value)
Upvotes: 1