Nautica
Nautica

Reputation: 2018

How to use lag/lead in mutate with an initial value?

Sample df:

library(tidyverse)

iris <- iris[1:10,]
iris$testlag <- NA
iris[[1,"testlag"]] <- 5

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species testlag
1           5.1         3.5          1.4         0.2  setosa       5
2           4.9         3.0          1.4         0.2  setosa      NA
3           4.7         3.2          1.3         0.2  setosa      NA
4           4.6         3.1          1.5         0.2  setosa      NA
5           5.0         3.6          1.4         0.2  setosa      NA
6           5.4         3.9          1.7         0.4  setosa      NA
7           4.6         3.4          1.4         0.3  setosa      NA
8           5.0         3.4          1.5         0.2  setosa      NA
9           4.4         2.9          1.4         0.2  setosa      NA
10          4.9         3.1          1.5         0.1  setosa      NA

In the testlag column, I'm interesting in using dplyr::lag() to retrieve the previous value and add some column, for example Petal.Length to it. As I have only one initial value, each subsequent calculation requires it to work iteratively, so I thought something like mutate would work.

I first tried doing something like this:

iris %>% mutate_at("testlag", ~ lag(.) + Petal.Length)

But this removed the first value, and only gave a valid value for the second row and NAs for the rest. Intuitively I know why it's removing the first value, but I thought the nature of mutate would allow it to work for the rest of the values, so I don't know how to fix that.

Of course using base R I could something like:

for (idx in 2:nrow(iris)) {
  iris[[idx, "testlag"]] <-
    lag(iris$testlag)[idx] + iris[[idx, "Petal.Length"]]
}

But I would prefer to implement this in tidyverse syntax.

Edit: Desired output (from my for loop)

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species testlag
1           5.1         3.5          1.4         0.2  setosa     5.0
2           4.9         3.0          1.4         0.2  setosa     6.4
3           4.7         3.2          1.3         0.2  setosa     7.7
4           4.6         3.1          1.5         0.2  setosa     9.2
5           5.0         3.6          1.4         0.2  setosa    10.6
6           5.4         3.9          1.7         0.4  setosa    12.3
7           4.6         3.4          1.4         0.3  setosa    13.7
8           5.0         3.4          1.5         0.2  setosa    15.2
9           4.4         2.9          1.4         0.2  setosa    16.6
10          4.9         3.1          1.5         0.1  setosa    18.1

Upvotes: 3

Views: 1914

Answers (2)

Shan R
Shan R

Reputation: 541

Does this work for you?

library(tidyverse)
library("data.table")

iris <- iris[1:10,]
iris$testlag <- NA
iris[[1,"testlag"]] <- 5

iris %>% mutate (testlag = lag(first(testlag) + cumsum(Petal.Length)))

Result:

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species testlag
1           5.1         3.5          1.4         0.2  setosa      NA
2           4.9         3.0          1.4         0.2  setosa     6.4
3           4.7         3.2          1.3         0.2  setosa     7.8
4           4.6         3.1          1.5         0.2  setosa     9.1
5           5.0         3.6          1.4         0.2  setosa    10.6
6           5.4         3.9          1.7         0.4  setosa    12.0
7           4.6         3.4          1.4         0.3  setosa    13.7
8           5.0         3.4          1.5         0.2  setosa    15.1
9           4.4         2.9          1.4         0.2  setosa    16.6
10          4.9         3.1          1.5         0.1  setosa    18.0

Since technically there is no N-1 Petal length when N = 1, I left the first value of testlag NA. Do you really need it to be initial value? If you need, this will work:

iris %>% mutate (testlag = lag(first(testlag) + cumsum(Petal.Length), default=first(testlag)))

Upvotes: 3

Will Pike
Will Pike

Reputation: 280

The function you're looking for is tidyr::fill

library(tidyverse)

iris <- iris[1:10,]
iris$testlag <- NA
iris[[1,"testlag"]] <- 5

iris %>% fill(testlag, .direction = "down")
# Note the default is 'down', but I included here for completeness

This takes the specified column (testlag in this case), and copies any values in that column to the values below. This also works if you have a value in a subset of the rows: it copies the value down until it reaches a new value, then it picks up with that one.

For example:

library(tidyverse)

iris <- iris[1:10,]
iris$testlag <- NA
iris[[1,"testlag"]] <- 5
iris[[5,"testlag"]] <- 10
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species testlag
1           5.1         3.5          1.4         0.2  setosa       5
2           4.9         3.0          1.4         0.2  setosa      NA
3           4.7         3.2          1.3         0.2  setosa      NA
4           4.6         3.1          1.5         0.2  setosa      NA
5           5.0         3.6          1.4         0.2  setosa      10
6           5.4         3.9          1.7         0.4  setosa      NA
7           4.6         3.4          1.4         0.3  setosa      NA
8           5.0         3.4          1.5         0.2  setosa      NA
9           4.4         2.9          1.4         0.2  setosa      NA
10          4.9         3.1          1.5         0.1  setosa      NA

Applying this function...

iris %>% fill(testlag, .direction = "down")

Gives

   Sepal.Length Sepal.Width Petal.Length Petal.Width Species testlag
1           5.1         3.5          1.4         0.2  setosa       5
2           4.9         3.0          1.4         0.2  setosa       5
3           4.7         3.2          1.3         0.2  setosa       5
4           4.6         3.1          1.5         0.2  setosa       5
5           5.0         3.6          1.4         0.2  setosa      10
6           5.4         3.9          1.7         0.4  setosa      10
7           4.6         3.4          1.4         0.3  setosa      10
8           5.0         3.4          1.5         0.2  setosa      10
9           4.4         2.9          1.4         0.2  setosa      10
10          4.9         3.1          1.5         0.1  setosa      10

Upvotes: 0

Related Questions