Reputation: 3
Let's say I have a data set looking below. The text repeats 'a' to 'e' having value for each month. It lasts from 2016 Jan to 2016 March.
date text value
1-16 a 13
1-16 b 2
1-16 c 3
1-16 d 1
1-16 e 20
2-16 a 30
2-16 b 50
2-16 c 20
2-16 d 10
2-16 e 40
3-16 a 34
3-16 b 3
3-16 c 2
3-16 d 1
3-16 e 4
I want to make a column having value of difference by month and by the text. What I mean is, 17 = (30(a, 2-16) - 13(a, 1-16)) like this,
date text value the value that I want to have
1-16 a 13 na
1-16 b 2 na
1-16 c 3 na
1-16 d 1 na
1-16 e 20 na
2-16 a 30 =(value(a, 2016-feb) - value(a, 2016-jan)) 17
2-16 b 50 =(value(b, 2016-feb) - value(b, 2016-jan)) 48
2-16 c 20 =(value(c, 2016-feb) - value(c, 2016-jan)) 17
2-16 d 10 =(value(d, 2016-feb) - value(d, 2016-jan)) 9
2-16 e 40 =(value(e, 2016-feb) - value(e, 2016-jan)) 20
3-16 a 34 =(value(a, 2016-mar) - value(a, 2016-feb)) 4
3-16 b 3 =(value(b, 2016-mar) - value(b, 2016-feb)) -47
3-16 c 2 =(value(c, 2016-mar) - value(c, 2016-feb)) -18
3-16 d 1 =(value(d, 2016-mar) - value(d, 2016-feb)) -9
3-16 e 4 =(value(e, 2016-mar) - value(e, 2016-feb)) -36
I made the data above. The actual data is longer and even more complex as it contains approximately 2000 text for each month and the data is not even in date. (It's also divided by many categories). So some solutions that calculate the value manually would not work.
I tried with dplyr lag, for example,
df %>% group_by(date, text) %>%
arrange(date, text) %>%
mutate(diff = value - lag(value))
but it did not work. I assume the code did not recognise text well?
What would be the best way to do it?
Upvotes: 0
Views: 527
Reputation: 61154
Or completely in R base using ave
> df$newVal <- ave(df$value, df$text, FUN=function(x) c(NA,diff(x)))
> df
date text value newVal
1 1-16 a 13 NA
2 1-16 b 2 NA
3 1-16 c 3 NA
4 1-16 d 1 NA
5 1-16 e 20 NA
6 2-16 a 30 17
7 2-16 b 50 48
8 2-16 c 20 17
9 2-16 d 10 9
10 2-16 e 40 20
11 3-16 a 34 4
12 3-16 b 3 -47
13 3-16 c 2 -18
14 3-16 d 1 -9
15 3-16 e 4 -36
with aggregate
df$newval <- c(aggregate(value ~ text, data=df, FUN=function(x) c(NA,diff(x)))[,-1])
Upvotes: 0
Reputation: 18425
With dplyr
...
library(dplyr)
df %>% group_by(text) %>% mutate(newval=c(NA,diff(value)))
date text value newval
<chr> <chr> <int> <int>
1 1-16 a 13 NA
2 1-16 b 2 NA
3 1-16 c 3 NA
4 1-16 d 1 NA
5 1-16 e 20 NA
6 2-16 a 30 17
7 2-16 b 50 48
8 2-16 c 20 17
9 2-16 d 10 9
10 2-16 e 40 20
11 3-16 a 34 4
12 3-16 b 3 -47
13 3-16 c 2 -18
14 3-16 d 1 -9
15 3-16 e 4 -36
Upvotes: 1