Raul S.
Raul S.

Reputation: 85

Create a new tibble using the previous row value of a column as a parameter of the current row value

I want to manually create a tibble where one column values are calculated depending on the previous value of the same column.

For example:

tibble(
  x = 1:5,
  y = x + lag(y, default = 0)
)

I expect the following result:

# A tibble: 5 x 2
      x     y
  <int> <dbl>
1     1     1
2     2     3
3     3     6
4     4    10
5     5    15

But I obtain the error:

Error in lag(y, default = 0) : object 'y' not found

Update - more real example:

tibble(
  years = 1:5,
  salary = 20000 * (1.01) ^ lag(years, default = 0),
  qta = salary * 0.06
) %>%
  mutate(
    total = ifelse(row_number() == 1,
                   (qta + 50000) * (1.02),
                   (qta + lag(total, default = 0)) * (1.02))
  )

In this example I have a tibble, and I want to add a column 'total' that is defined depending on its previous value, but the lag(total, default = 0) doesn't work.

Upvotes: 1

Views: 431

Answers (2)

akrun
akrun

Reputation: 887971

We can use accumulate

library(tidyverse)
tibble(x = 1:5, y = accumulate(x, `+`))
# A tibble: 5 x 2
#      x     y
#  <int> <int>
#1     1     1
#2     2     3
#3     3     6
#4     4    10
#5     5    15

For a general function, it would be

tibble(x = 1:5, y = accumulate(x, ~ .x + .y))

We can also specify the initialization value

tibble(x = 1:5, y = accumulate(x[-1], ~ .x + .y, .init = x[1]))

Upvotes: 2

Nick
Nick

Reputation: 286

You're missing x instead of y in the lag() function to run without an error:

tibble(
 x = 1:5,
 y = x + lag(x, default = 0)
)

But as per @Ronak Shah's comment, you need the cumsum() function to get the same result as your example:

tibble(
 x = 1:5, 
 y = cumsum(x)
)

Upvotes: 1

Related Questions