Reputation: 4243
I have a dataframe like below:
Date <- as.Date(c('2017-10-16',
'2017-10-16',
'2017-10-17',
'2017-10-17',
'2017-10-18',
'2017-10-18',
'2017-10-19',
'2017-10-19',
'2017-10-20',
'2017-10-20'))
Source <- as.Date(c('2017-11-29',
'2017-11-30',
'2017-11-29',
'2017-11-30',
'2017-11-29',
'2017-11-30',
'2017-11-29',
'2017-11-30',
'2017-11-29',
'2017-11-30'))
Revenue <- c(206.88,
210.88,
194.13,
200.13,
170.00,
170.00,
746.65,
736.65,
772.00,
772.00)
Cost <- c(100.88,
10.88,
85.13,
100.13,
170.00,
100.00,
46.65,
50.65,
23.00,
24.00)
df <- data.frame(Date, Source, Revenue, Cost)
Dataframe:
df
Date Source Revenue Cost
1 2017-10-16 2017-11-29 206.88 100.88
2 2017-10-16 2017-11-30 210.88 10.88
3 2017-10-17 2017-11-29 194.13 85.13
4 2017-10-17 2017-11-30 200.13 100.13
5 2017-10-18 2017-11-29 170.00 170.00
6 2017-10-18 2017-11-30 170.00 100.00
7 2017-10-19 2017-11-29 746.65 46.65
8 2017-10-19 2017-11-30 736.65 50.65
9 2017-10-20 2017-11-29 772.00 23.00
10 2017-10-20 2017-11-30 772.00 24.00
How do I take the running difference by Date but do it for every column after the 2nd?
Final result needs to look like this:
Date Source Revenue Cost Revenue_Diff Cost_Diff .....................
1 2017-10-16 2017-11-29 206.88 100.88 NA NA .....................
2 2017-10-16 2017-11-30 210.88 10.88 4 -90 .....................
3 2017-10-17 2017-11-29 194.13 85.13 NA NA .....................
4 2017-10-17 2017-11-30 200.13 100.13 6 15 .....................
5 2017-10-18 2017-11-29 170.00 170.00 NA NA .....................
6 2017-10-18 2017-11-30 170.00 100.00 0 -70 .....................
7 2017-10-19 2017-11-29 746.65 46.65 NA NA .....................
8 2017-10-19 2017-11-30 736.65 50.65 -10 4 .....................
9 2017-10-20 2017-11-29 772.00 23.00 NA NA .....................
10 2017-10-20 2017-11-30 772.00 24.00 0 1 .....................
My current script can only do 1 column at a time but I want to be able to do everything to the right of Source:
test<-df %>%
group_by(Date) %>%
mutate(Revenue_Diff = c(NA, diff(`Revenue`)))
Any help would be great, thanks!
Upvotes: 0
Views: 158
Reputation: 39154
A solution using dplyr
. We can use mutate_at
to specify the columns to conduct operations. lag
can change the position of the values for the calculation.
library(dplyr)
df2 <- df %>%
group_by(Date) %>%
mutate_at(vars(-Source), funs(Diff = . - lag(.))) %>%
ungroup()
df2
# # A tibble: 10 x 6
# Date Source Revenue Cost Revenue_Diff Cost_Diff
# <date> <date> <dbl> <dbl> <dbl> <dbl>
# 1 2017-10-16 2017-11-29 206.88 100.88 NA NA
# 2 2017-10-16 2017-11-30 210.88 10.88 4 -90
# 3 2017-10-17 2017-11-29 194.13 85.13 NA NA
# 4 2017-10-17 2017-11-30 200.13 100.13 6 15
# 5 2017-10-18 2017-11-29 170.00 170.00 NA NA
# 6 2017-10-18 2017-11-30 170.00 100.00 0 -70
# 7 2017-10-19 2017-11-29 746.65 46.65 NA NA
# 8 2017-10-19 2017-11-30 736.65 50.65 -10 4
# 9 2017-10-20 2017-11-29 772.00 23.00 NA NA
# 10 2017-10-20 2017-11-30 772.00 24.00 0 1
Upvotes: 1