Reputation: 209
I'm a beginner in R and I'm facing an issue.
Problem: I need to sort a dataframe by 2 columns (ID, i'th column) and then take lagged difference of the i'th column and record it. Then resort the data with the ID and the i+1 column and so on and so forth.
What I have written up till now:
for (val in (4:length(colnames(df)))){
df <- df[with(df, order(ID, df[val])), ]
d2_df <- df %>%
mutate_at(c(df[val]), list(lagged = ~ . - lag(.)))
}
The above code is messing somehow because the mutate_at function is throwing the error below:
Error: `.vars` must be a character/numeric vector or a `vars()` object, not a list.
Original dataset:
ID S1 S2
1 1 3 1
2 1 5 2
3 1 1 3
4 2 2 7
5 3 4 9
6 3 2 11
After Sort on ID and S1
ID S1 S2
1 1 1 3
2 1 3 1
3 1 5 2
4 2 2 7
5 3 2 11
6 3 4 9
Now what I need? S1.1 (which is the lagged difference of the sorted dataframe respective to each ID)
ID S1 S2 S1.1
1 1 1 3 NA
2 1 3 1 2
3 1 5 2 2
4 2 2 7 NA
5 3 2 11 NA
6 3 4 9 2
Similar logic applies for S2 where a new S2.2 will be generated. Any help would be immensely appreciated.
Additionally what is required (below); where sum.S1 is the sum of the lagged differences and count.S1 is the count of observations at S1 for respective ID:
ID sum.S1 sum.S2 count.S1 count.S2
1 1 4 2 3 3
2 2 NA NA 1 1
3 3 2 2 2 2
Upvotes: 0
Views: 193
Reputation: 389235
Here's a way using non-standard evaluation (NSE) :
library(dplyr)
library(purrr)
library(rlang)
cols <- c('S1', 'S2')
bind_cols(df, map_dfc(cols, ~{
col <- sym(.x)
df %>%
arrange(ID, !!col) %>%
group_by(ID) %>%
transmute(!!paste0(.x, '.1') := !!col - lag(!!col)) %>%
ungroup %>%
select(-ID)
}))
# ID S1 S2 S1.1 S2.1
#1 1 3 1 NA NA
#2 1 5 2 2 1
#3 1 1 3 2 1
#4 2 2 7 NA NA
#5 3 4 9 NA NA
#6 3 2 11 2 2
data
df <- structure(list(ID = c(1L, 1L, 1L, 2L, 3L, 3L), S1 = c(3L, 5L,
1L, 2L, 4L, 2L), S2 = c(1L, 2L, 3L, 7L, 9L, 11L)),
class = "data.frame", row.names = c(NA, -6L))
Upvotes: 1