Reputation: 35
I expected a cumulative sum in the "n_rest" column, but I only get a copy of the "n_i" column. The problem goes away if I uncomment the "# as.data.frame() %>%" line, but I don't like this solution and I would like to understand what I am doing wrong.
Thanks in advance!
library(dplyr)
t <- c(42,57,63,98,104,105,132,132,132,133,133,133,139,140,161,180,180,195,195,233)
status <- c(1 ,1 ,1 ,1 ,0 ,1 ,1 ,1 ,1 ,1 ,1 ,1 ,1 ,1 ,1 ,1 ,1 ,1 ,1 , 0)
KMP <- function(time, status) {
  # use the function arguments rather than the global vectors
  n_ges <- length(time)
  df <- data.frame(t = time, status = status, n = 1)
  df <- df %>%
    group_by(t, status) %>%
    summarise(n_i = sum(n)) %>%
    # as.data.frame() %>%
    mutate(n_rest = rev(cumsum(n_i)))
  df
}
Upvotes: 0
Views: 48
Reputation: 94307
The mutate is still working on the groups. By passing the result into as.data.frame you are dropping the grouping. Alternatively, reset the grouping by putting an empty group_by() in the pipe:
> df %>% group_by(t,status) %>% summarise(n_i=sum(n)) %>% group_by() %>% mutate(n_rest=cumsum(n_i))
# A tibble: 14 x 4
t status n_i n_rest
<dbl> <dbl> <dbl> <dbl>
1 42 1 1 1
2 57 1 1 2
3 63 1 1 3
4 98 1 1 4
5 104 0 1 5
6 105 1 1 6
7 132 1 3 9
8 133 1 3 12
9 139 1 1 13
10 140 1 1 14
11 161 1 1 15
12 180 1 2 17
13 195 1 2 19
14 233 0 1 20
Upvotes: 1
Reputation: 389275
That is because your data frame is still grouped by t. You can see this in the "Groups" line if you check the output of:
library(dplyr)
df %>% group_by(t,status) %>% summarise(n_i = sum(n))
# A tibble: 14 x 3
# Groups: t [14]
# t status n_i
# <dbl> <dbl> <dbl>
# 1 42 1 1
# 2 57 1 1
# 3 63 1 1
# 4 98 1 1
# 5 104 0 1
# 6 105 1 1
# 7 132 1 3
# 8 133 1 3
# 9 139 1 1
#10 140 1 1
#11 161 1 1
#12 180 1 2
#13 195 1 2
#14 233 0 1
From ?summarise
An object of the same class as .data. One grouping level will be dropped.
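As you are grouping by t and status, the grouping by status is dropped while the grouping by t is kept as it is, hence your cumsum is computed within each group of t. After summarise each group of t contains a single row, so the cumulative sum of that one value is the value itself, which is why n_rest is just a copy of n_i.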
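To check the remaining grouping directly rather than reading it off the printed tibble, something like the following sketch could be used. It rebuilds the data from the question; the name summarised is only chosen here for illustration, and group_vars() and n_groups() are dplyr helpers for inspecting a grouped data frame.

```r
library(dplyr)

# Same data as in the question
t <- c(42,57,63,98,104,105,132,132,132,133,133,133,139,140,161,180,180,195,195,233)
status <- c(1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0)
df <- data.frame(t = t, status = status, n = 1)

# 'summarised' is an illustrative name, not from the original post
summarised <- df %>%
  group_by(t, status) %>%
  summarise(n_i = sum(n))

# Grouping variables that are still attached after summarise()
group_vars(summarised)

# Every remaining group holds exactly one row, so a per-group
# cumsum() can only return the value it was given
n_groups(summarised) == nrow(summarised)

# After ungroup() no grouping variables are left
group_vars(ungroup(summarised))
```

If the first call returns "t", any following mutate() is still evaluated once per value of t.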
You can remove the effect of grouping by using ungroup() after summarise():
df %>%
group_by(t,status) %>%
summarise(n_i = sum(n)) %>%
ungroup() %>%
mutate(n_rest = rev(cumsum(n_i)))
The same effect was achieved using as.data.frame() in OP's code.
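Another option, assuming dplyr 1.0.0 or later is installed, is the .groups argument of summarise(), which lets you drop all grouping in the same step instead of adding a separate ungroup(). A minimal sketch with the data from the question (the name res is only illustrative):

```r
library(dplyr)

# Same data as in the question
t <- c(42,57,63,98,104,105,132,132,132,133,133,133,139,140,161,180,180,195,195,233)
status <- c(1,1,1,1,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0)
df <- data.frame(t = t, status = status, n = 1)

# 'res' is an illustrative name, not from the original post
res <- df %>%
  group_by(t, status) %>%
  # .groups = "drop" needs dplyr >= 1.0.0; it returns an ungrouped tibble
  summarise(n_i = sum(n), .groups = "drop") %>%
  # with no grouping left, cumsum() runs over the whole n_i column
  mutate(n_rest = rev(cumsum(n_i)))

res
```

On older dplyr versions the .groups argument is not available, so ungroup() or as.data.frame() remains the way to go there.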
Upvotes: 0