Reputation: 109
I am trying to get a balance value from a df that looks like this
df1
Name Year Ch1 Origin
A 1995 x1 a
A 1996 x2 b
A 1997 x3 a
A 2000 x4 a
B 1997 y1 c
B 1998 y2 c
.....
while Ch1 is numerical. and I want to add an extra col to have this value:
Name Year Ch1 Bil
A 1995 x1
A 1996 x2 %of year before (x2-x1/x1*100)%
A 1997 x3 %of year before (x3-x2/x2*100)%
A 2000 x4 %of year before (x4-x3/x3*100)%
B 1997 y1
B 1998 y2 %of year before (y2-y1/x1*100)%
.....
now I know I could create a loop looking something like this:
for (i in nrow(df1))
if (df[i,1]==df[i-1,1]) {
df$Bil<-(df[i,3]-df[i-1,3])/df[i-1,3]*100
} else ...
Is there a more elegant or quicker way to calculate this? This way I really need to make sure that the dataset is in the right order (going from older to recent years). Lets say also dependant on an extra detail such as origin so that the calculation only happens if name and origin are the same?
Thank you!
Upvotes: 1
Views: 46
Reputation: 160447
All three solutions require the data be in the correct order within each Name
. For instance, you can interleave names of "A" and "B" (all groupings below will handle it), but the years should likely be non-decreasing.
Prepping by putting in real Ch1
:
set.seed(42)
df1$Ch1 <- c(sort(sample(20, size=4)), sort(sample(20, size=2)))
df1 <- df1[order(df1$Name, df1$Year),]
df1$Bil <- ave(df1$Ch1, df1$Name, df1$Origin,
FUN=function(z) 100 * c(0, diff(z) / head(z, n = -1)))
df1
# Name Year Ch1 Origin Bil
# 1 A 1995 6 a 0.000000
# 2 A 1996 15 b 0.000000
# 3 A 1997 18 a 200.000000
# 4 A 2000 19 a 5.555556
# 5 B 1997 10 c 0.000000
# 6 B 1998 13 c 30.000000
library(dplyr)
df1 %>%
arrange(Name, Year) %>%
group_by(Name, Origin) %>%
mutate(
Bil = 100 * c(0, diff(Ch1) / head(Ch1, n = -1))
) %>%
ungroup()
library(data.table)
library(magrittr)
df1DT <- as.data.table(df1)
setorder(df1DT, Name, Year)
df1DT[, Bil := 100 * c(0, diff(Ch1) / head(Ch1, n = -1)), by = .(Name, Origin)]
Upvotes: 2