mapra99

Reputation: 53

How to make progressive operations in data frames with dplyr or similar R packages?

I have this dataframe:

df <- data.frame(a = c(1,2,3,4,5),
                 b = c(6,5,4,6,1))

I need to create a column 'c' whose i-th element is the sum of the i-th element of 'a' and the (i+1)-th element of 'b'. The last element of 'c' has no following 'b' value, so it should simply equal the last element of 'a'. With a for loop, the code would be something like this:

#Initialize the 'c' column
df$c <- vector("double", nrow(df))

#For Loop
for (i in seq_len(nrow(df) - 1)) {
  df$c[i] <- df$a[i] + df$b[i + 1]
}
df$c[nrow(df)] <- df$a[nrow(df)]

I am familiar with dplyr::mutate(), but I don't know how to replace that loop with it. Is there a function in dplyr or another package that can help with this kind of operation?

Upvotes: 0

Views: 59

Answers (2)

chinsoon12

Reputation: 25225

You can use data.table::shift to lead the b column by one row and add it to a:

dt[, C := ifelse(is.na(shift(b, type="lead")), a, a + shift(b, type="lead"))][]

Or using replace to handle the tail case:

dt[, C := {
        x <- shift(b, type="lead")
        a + replace(x, is.na(x), 0)
    }]

Edit: I had missed the fill argument of shift (the equivalent of default in dplyr::lead), which makes this simpler:

dt[, C := a + shift(b, fill=0, type="lead")]

data:

library(data.table)

dt <- data.table(a = c(1,2,3,4,5),
                 b = c(6,5,4,6,1))

Upvotes: 1

Marius

Reputation: 60060

Using lead() in dplyr:

library(dplyr)

df %>%
    mutate(c = a + lead(b, default = 0))
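
For comparison, the same idea can be sketched in base R without any packages. This is only an illustration on the question's toy data; the helper name b_next is made up here, and padding with 0 is an assumption that mirrors lead(b, default = 0):

```r
# Same toy data as in the question.
df <- data.frame(a = c(1, 2, 3, 4, 5),
                 b = c(6, 5, 4, 6, 1))

# Drop the first element of b and pad the end with 0, so that
# position i holds b[i + 1] (b_next is a hypothetical helper name).
b_next <- c(df$b[-1], 0)

# Vectorised sum: the last row gets a + 0, i.e. just its 'a' value.
df$c <- df$a + b_next
print(df$c)
```

On this data the result is 6 6 9 5 5, which matches what the for loop in the question produces.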

Upvotes: 2
