Carlo
Carlo

Reputation: 155

Consecutive Sum of column x that relates to column y

In my dataset the Volume count always starts at 1 with every new Day. Like that:

Particle_Size Titrant      Day    Volume_titrant   
    <chr>         <chr>   <dbl>          <dbl>     
  1 > 40          NaOH        0              1     
  2 > 40          NaOH        0              2     
  3 > 40          NaOH        0              3     
  4 > 40          NaOH        0              4     
  5 > 40          NaOH        0              5     
 39 > 40          NaOH        8              1    
 40 > 40          NaOH        8              2     
 41 > 40          NaOH        8              3     

105 <60           NaOH        0              1     
106 <60           NaOH        0              2     
107 <60           NaOH        0              3     
131 <60           NaOH        5              2     
132 <60           NaOH        5              4    

What I want is an additional column that counts the total Volume over all days for each Particle_Size. Like that:

Particle_Size Titrant       Day     Volume_titrant  Volume_total
    <chr>         <chr>   <dbl>          <dbl>      <dbl>
  1 > 40          NaOH        0              1     1
  2 > 40          NaOH        0              2     2
  3 > 40          NaOH        0              3     3
  4 > 40          NaOH        0              4     4
  5 > 40          NaOH        0              5     5
 39 > 40          NaOH        8              1     6
 40 > 40          NaOH        8              2     7
 41 > 40          NaOH        8              3     8

105 <60           NaOH        0              1     1
106 <60           NaOH        0              2     2
107 <60           NaOH        0              3     3
131 <60           NaOH        5              2     5
132 <60           NaOH        5              4     9

I tried to group my dataset and then mutate a new column with cumsum(). This adds the Volume of several days together and also starts again at 1 with every new Particle_Size. So far so good, but I in the example you see that this doesn't count the Volume as I want it.

df = na.omit()%>%
     group_by(Particle_Size, Titrant)%>%
     mutate(Total_Volume = cumsum(Volume_titrant))
Particle_Size Titrant       Day     Volume_titrant   Cumsum
    <chr>         <chr>   <dbl>          <dbl>      <dbl>
  1 > 40          NaOH        0              1     1
  2 > 40          NaOH        0              2     3
  3 > 40          NaOH        0              3     6
  4 > 40          NaOH        0              4     10
  5 > 40          NaOH        0              5     15
 39 > 40          NaOH        8              1     16
 40 > 40          NaOH        8              2     18
 41 > 40          NaOH        8              3     21

105 <60           NaOH        0              1     1
106 <60           NaOH        0              2     3
107 <60           NaOH        0              3     6
131 <60           NaOH        5              2     8
132 <60           NaOH        5              4     12

Also tried it with rollsum() instead of cumsum(), but than I get a problem with the k argument. With k = 1 it just copies the Volume column and with any other k I get an error code like this:

Error: Problem with `mutate()` input `Total_Volume`.
x Input `Total_Volume` can't be recycled to size 48.
i Input `Total_Volume` is `rollsum(Volume_titrant, k = 2)`.
i Input `Total_Volume` must be size 48 or 1, not 47.
i The error occurred in group 1: Particle_Size = "<60", Titrant = "NaOH".

Is there a function out there or do I need a solution with a loop? Any ideas?

enter code here

Upvotes: 0

Views: 80

Answers (2)

AnilGoyal
AnilGoyal

Reputation: 26218

As I have already stated there seems some typo in your desired output. why row 132 desired output is 9? Whereas the desired output in rows 40 & 41 is 7 & 8 respectively. Either row 132 should be 7 or rows 40 & 41 should be 8 and 11 respectively? Your logic is somewhat confusing! Do you just want to add previous day's last value to current day? Try this syntax

library(dplyr)

df %>% left_join(df %>% group_by(Particle_Size, Day) %>%
                   summarise(last_val = last(Volume_titrant), .groups = 'drop') %>%
                   group_by(Particle_Size) %>%
                   mutate(last_val = cumsum(lag(last_val, default = 0))), by = c('Particle_Size', 'Day')) %>%
  mutate(desired_value = Volume_titrant + last_val)

#>    Particle_Size Titrant Day Volume_titrant last_val desired_value
#> 1            >40    NaOH   0              1        0             1
#> 2            >40    NaOH   0              2        0             2
#> 3            >40    NaOH   0              3        0             3
#> 4            >40    NaOH   0              4        0             4
#> 5            >40    NaOH   0              5        0             5
#> 6            >40    NaOH   8              1        5             6
#> 7            >40    NaOH   8              2        5             7
#> 8            >40    NaOH   8              3        5             8
#> 9            <60    NaOH   0              1        0             1
#> 10           <60    NaOH   0              2        0             2
#> 11           <60    NaOH   0              3        0             3
#> 12           <60    NaOH   5              2        3             5
#> 13           <60    NaOH   5              4        3             7

Created on 2021-06-04 by the reprex package (v2.0.0)

Upvotes: 2

Zaw
Zaw

Reputation: 1474

You could try using sum() in a mutate() call?

library(dplyr)

dat %>%
  group_by(Particle_Size) %>%
  mutate(Total_valume = sum(Volume_titrant)) %>% 
  ungroup()

# # A tibble: 13 x 5
#    Particle_Size Titrant   Day Volume_titrant Total_valume
#            <int> <chr>   <int>          <int>        <int>
#  1            40 NaOH        0              1           21
#  2            40 NaOH        0              2           21
#  3            40 NaOH        0              3           21
#  4            40 NaOH        0              4           21
#  5            40 NaOH        0              5           21
#  6            40 NaOH        8              1           21
#  7            40 NaOH        8              2           21
#  8            40 NaOH        8              3           21
#  9            60 NaOH        0              1           12
# 10            60 NaOH        0              2           12
# 11            60 NaOH        0              3           12
# 12            60 NaOH        5              2           12
# 13            60 NaOH        5              4           12

Upvotes: 0

Related Questions