user2786962
user2786962

Reputation: 469

Forming a new variable based on previous row value

My data is of the following form

structure(list(Flag = c(1, 0, 0, 1, 0, 0, 1, 0), variable = c(3, 
8, 6, 7, 1, 4, 3, 6), sale = c(26, 27, 61, 38, 79, 87, 81, 13
)), .Names = c("Flag", "variable", "sale"), row.names = c(NA, 
-8L), class = "data.frame")

And I want to create output as follows

structure(list(Flag = c(1, 0, 0, 1, 0, 0, 1, 0), variable = c(3, 
8, 6, 7, 1, 4, 3, 6), sale = c(26, 27, 61, 38, 79, 87, 81, 13
), begin = c(3, -23, -50, 7, -31, -70, 3, -78), end = c(-23, 
-50, -111, -31, -70, -151, -78, -91)), .Names = c("Flag", "variable", 
"sale", "begin", "end"), row.names = c(NA, -8L), class = "data.frame")

where the ne column begin and end are based on the following algorathim

if flag=1 then 
    begin=variable;
    end=variable-sale;
----------
else
begin=lag(end) ( i.e the previous value of end variable)
end= lag(end)-sale

What I want is when flag is 1 the value of "begin" is equalt to "variable" value and "end" value is "variable-sale" value. Where as for the others the value of begin is the previous row "end" value and "end" value is (begin-sales) value Can anyone help me how to write achieve this in R?

Upvotes: 0

Views: 453

Answers (1)

A5C1D2H2I1M1N2O1R2T1
A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

I think the example output you provide is incorrect, but I would try the following:

beginEnd <- by(indf, cumsum(indf$Flag), FUN = function(x) {
  out <- Reduce("-", c(x[, "variable"][1], x[, "sale"]), accumulate = TRUE)
  cbind(begin = head(out, -1), 
        end = tail(out, -1))
})
cbind(indf, do.call(rbind, beginEnd))
#   Flag variable sale begin  end
# 1    1        3   26     3  -23
# 2    0        8   27   -23  -50
# 3    0        6   61   -50 -111
# 4    1        7   38     7  -31
# 5    0        1   79   -31 -110
# 6    0        4   87  -110 -197
# 7    1        3   81     3  -78
# 8    0        6   13   -78  -91

Upvotes: 3

Related Questions