How to subset and apply a function across the dataset

Question

I have a list of prices for different items in the same dataset.

abc1 <- c("2005-09-18", "ABC", 99.00)
abc2 <- c("2005-09-19", "ABC", 98.00)
abc3 <- c("2005-09-20", "ABC", 98.50)
abc4 <- c("2005-09-21", "ABC", 97.75)
def1 <- c("2005-09-14", "DEF", 79.00)
def2 <- c("2005-09-15", "DEF", 78.00)
def3 <- c("2005-09-16", "DEF", 78.50)
def4 <- c("2005-09-20", "DEF", 77.75)

df <- data.frame(rbind(abc1, abc2, abc3, abc4, def1, def2, def3, def4))

The code above produces:

             X1  X2    X3
abc1 2005-09-18 ABC    99
abc2 2005-09-19 ABC    98
abc3 2005-09-20 ABC  98.5
abc4 2005-09-21 ABC 97.75
def1 2005-09-14 DEF    79
def2 2005-09-15 DEF    78
def3 2005-09-16 DEF  78.5
def4 2005-09-20 DEF 77.75

I would like to add a column, say X4, holding the percentage change from the previous day's price within each X2 item. X4 would then have the following values:

 X4
 0.0%
-1.0%
 0.5%
-0.8%
 0.0%
-1.3%
 0.6%
-1.0%

The goal is to do this for every item in X2, ideally without splitting the table. I think the dates will always be in the right order, but the solution should not rely on that, just in case.

akrun · Accepted Answer

We can group by 'X2' and take the difference of adjacent elements of 'X3' with diff():

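library(dplyr)

# X3 was built by rbind() on character vectors, so convert it to numeric first
df %>%
   mutate(X3 = as.numeric(as.character(X3))) %>%
   group_by(X2) %>%
   mutate(X4 = c(0, diff(X3)))   # the first row of each group has no previous value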
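
To see why the leading 0 is needed, here is a small sketch on a plain vector, using the ABC prices from the question. diff() returns one element fewer than its input, so the result has to be padded before it can fill a column. The names x, d and x4 are illustrative only.

```r
# Prices for item ABC, taken from the question
x <- c(99.00, 98.00, 98.50, 97.75)

# Differences between adjacent elements: one value fewer than the input
d <- diff(x)

# Pad with 0 for the first row so the result has the same length as x
x4 <- c(0, d)
```

Inside mutate() the same padding happens once per group, which is why each item's first row gets 0.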

Or, after grouping by 'X2', take the difference between 'X3' and the lag of 'X3':

df %>%
   mutate(X3 = as.numeric(as.character(X3))) %>%   # X3 is not numeric in the question's df
   group_by(X2) %>%
   mutate(X4 = X3 - lag(X3, default = first(X3)))
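
Both versions return the absolute difference. If the percentage change shown in the question is wanted, one possible sketch is below. It assumes the data is rebuilt with typed columns (df2 and res are illustrative names, not from the question) and sorts by date within each item first, since the question was unsure about the ordering.

```r
library(dplyr)

# The question's data rebuilt with explicit column types (illustrative)
df2 <- data.frame(
  X1 = as.Date(c("2005-09-18", "2005-09-19", "2005-09-20", "2005-09-21",
                 "2005-09-14", "2005-09-15", "2005-09-16", "2005-09-20")),
  X2 = rep(c("ABC", "DEF"), each = 4),
  X3 = c(99.00, 98.00, 98.50, 97.75, 79.00, 78.00, 78.50, 77.75)
)

res <- df2 %>%
  arrange(X2, X1) %>%          # sort by date within each item, just in case
  group_by(X2) %>%
  mutate(
    # percent change versus the previous row, rounded to one decimal
    X4 = round(100 * (X3 / lag(X3) - 1), 1),
    # the first row of each group has no previous value, so use 0
    X4 = coalesce(X4, 0)
  ) %>%
  ungroup()
```

Here X4 is a plain number in percent units. Formatting it as text (for example with a "%" suffix) is best left to the display step so the column stays numeric.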
