Reputation: 155
In my dataset the Volume count always starts at 1 with every new Day. Like that:
Particle_Size Titrant Day Volume_titrant
<chr> <chr> <dbl> <dbl>
1 > 40 NaOH 0 1
2 > 40 NaOH 0 2
3 > 40 NaOH 0 3
4 > 40 NaOH 0 4
5 > 40 NaOH 0 5
39 > 40 NaOH 8 1
40 > 40 NaOH 8 2
41 > 40 NaOH 8 3
105 <60 NaOH 0 1
106 <60 NaOH 0 2
107 <60 NaOH 0 3
131 <60 NaOH 5 2
132 <60 NaOH 5 4
What I want is an additional column that counts the total Volume over all days for each Particle_Size
. Like that:
Particle_Size Titrant Day Volume_titrant Volume_total
<chr> <chr> <dbl> <dbl> <dbl>
1 > 40 NaOH 0 1 1
2 > 40 NaOH 0 2 2
3 > 40 NaOH 0 3 3
4 > 40 NaOH 0 4 4
5 > 40 NaOH 0 5 5
39 > 40 NaOH 8 1 6
40 > 40 NaOH 8 2 7
41 > 40 NaOH 8 3 8
105 <60 NaOH 0 1 1
106 <60 NaOH 0 2 2
107 <60 NaOH 0 3 3
131 <60 NaOH 5 2 5
132 <60 NaOH 5 4 9
I tried to group my dataset and then mutate a new column with cumsum()
.
This adds the Volume of several days together and also starts again at 1 with every new Particle_Size. So far so good, but I in the example you see that this doesn't count the Volume as I want it.
df = na.omit()%>%
group_by(Particle_Size, Titrant)%>%
mutate(Total_Volume = cumsum(Volume_titrant))
Particle_Size Titrant Day Volume_titrant Cumsum
<chr> <chr> <dbl> <dbl> <dbl>
1 > 40 NaOH 0 1 1
2 > 40 NaOH 0 2 3
3 > 40 NaOH 0 3 6
4 > 40 NaOH 0 4 10
5 > 40 NaOH 0 5 15
39 > 40 NaOH 8 1 16
40 > 40 NaOH 8 2 18
41 > 40 NaOH 8 3 21
105 <60 NaOH 0 1 1
106 <60 NaOH 0 2 3
107 <60 NaOH 0 3 6
131 <60 NaOH 5 2 8
132 <60 NaOH 5 4 12
Also tried it with rollsum()
instead of cumsum(), but than I get a problem with the k
argument. With k = 1
it just copies the Volume column and with any other k
I get an error code like this:
Error: Problem with `mutate()` input `Total_Volume`.
x Input `Total_Volume` can't be recycled to size 48.
i Input `Total_Volume` is `rollsum(Volume_titrant, k = 2)`.
i Input `Total_Volume` must be size 48 or 1, not 47.
i The error occurred in group 1: Particle_Size = "<60", Titrant = "NaOH".
Is there a function out there or do I need a solution with a loop? Any ideas?
enter code here
Upvotes: 0
Views: 80
Reputation: 26218
As I have already stated there seems some typo in your desired output. why row 132 desired output is 9? Whereas the desired output in rows 40 & 41 is 7 & 8 respectively. Either row 132 should be 7 or rows 40 & 41 should be 8 and 11 respectively? Your logic is somewhat confusing! Do you just want to add previous day's last value to current day? Try this syntax
library(dplyr)
df %>% left_join(df %>% group_by(Particle_Size, Day) %>%
summarise(last_val = last(Volume_titrant), .groups = 'drop') %>%
group_by(Particle_Size) %>%
mutate(last_val = cumsum(lag(last_val, default = 0))), by = c('Particle_Size', 'Day')) %>%
mutate(desired_value = Volume_titrant + last_val)
#> Particle_Size Titrant Day Volume_titrant last_val desired_value
#> 1 >40 NaOH 0 1 0 1
#> 2 >40 NaOH 0 2 0 2
#> 3 >40 NaOH 0 3 0 3
#> 4 >40 NaOH 0 4 0 4
#> 5 >40 NaOH 0 5 0 5
#> 6 >40 NaOH 8 1 5 6
#> 7 >40 NaOH 8 2 5 7
#> 8 >40 NaOH 8 3 5 8
#> 9 <60 NaOH 0 1 0 1
#> 10 <60 NaOH 0 2 0 2
#> 11 <60 NaOH 0 3 0 3
#> 12 <60 NaOH 5 2 3 5
#> 13 <60 NaOH 5 4 3 7
Created on 2021-06-04 by the reprex package (v2.0.0)
Upvotes: 2
Reputation: 1474
You could try using sum()
in a mutate()
call?
library(dplyr)
dat %>%
group_by(Particle_Size) %>%
mutate(Total_valume = sum(Volume_titrant)) %>%
ungroup()
# # A tibble: 13 x 5
# Particle_Size Titrant Day Volume_titrant Total_valume
# <int> <chr> <int> <int> <int>
# 1 40 NaOH 0 1 21
# 2 40 NaOH 0 2 21
# 3 40 NaOH 0 3 21
# 4 40 NaOH 0 4 21
# 5 40 NaOH 0 5 21
# 6 40 NaOH 8 1 21
# 7 40 NaOH 8 2 21
# 8 40 NaOH 8 3 21
# 9 60 NaOH 0 1 12
# 10 60 NaOH 0 2 12
# 11 60 NaOH 0 3 12
# 12 60 NaOH 5 2 12
# 13 60 NaOH 5 4 12
Upvotes: 0