ament

Reputation: 83

summarize across multiple cases of a variable

I'm looking for a more elegant way to summarize over unique cases of a variable, based on multiple criteria. My example below achieves what I want in the dams object, but I'd like to simplify it into a single statement. Note that I filter for a different range of JulianDay for each case of Dam in my two intermediate summary objects (BON and MCN), which are then joined to create the desired outcome in the dams object. It seems like the across() function would be part of the solution, but I haven't figured it out yet.

dam_counts
# A tibble: 364,689 x 10
   Dam   DataDate             Year Month   Day JulianDay Species   LifeStage ClipStatus Count
   <chr> <dttm>              <dbl> <dbl> <int>     <dbl> <chr>     <chr>     <chr>      <dbl>
 1 BON   2014-01-01 00:00:00  2014     1     1         1 Coho      Adult     Total          1
 2 BON   2014-01-01 00:00:00  2014     1     1         1 Coho      Jack      Total         -1
 3 BON   2014-01-01 00:00:00  2014     1     1         1 Sockeye   Adult     Total          0
 4 BON   2014-01-01 00:00:00  2014     1     1         1 Steelhead Adult     Total          1
 5 BON   2014-01-01 00:00:00  2014     1     1         1 Steelhead Adult     Unclipped      0
 6 BON   2014-01-01 00:00:00  2014     1     1         1 Pink      NA        Total          0
 7 BON   2014-01-01 00:00:00  2014     1     1         1 shad      NA        Total          0
 8 BON   2014-01-01 00:00:00  2014     1     1         1 Chum      NA        Total          0
 9 BON   2014-01-01 00:00:00  2014     1     1         1 Chinook   Minijack  Total          0
10 BON   2014-01-01 00:00:00  2014     1     1         1 Lamprey   NA        Total          0
# ... with 364,679 more rows

> BON <- dam_counts %>% 
+   filter(Year %in% 2015:2021, JulianDay %in% 1:167, Dam == "BON", Species == "Chinook", LifeStage == "Adult") %>%
+   group_by(Year) %>%
+   summarize(BON=sum(Count))
> BON
# A tibble: 7 x 2
   Year    BON
  <dbl>  <dbl>
1  2015 265558
2  2016 172614
3  2017 107524
4  2018 108045
5  2019  71235
6  2020  79714
7  2021  87233

> MCN <- dam_counts %>% 
+   filter(Year %in% 2015:2021, JulianDay %in% 1:175, Dam == "MCN", Species == "Chinook", LifeStage == "Adult") %>%
+   group_by(Year) %>%
+   summarize(MCN=sum(Count))
> MCN
# A tibble: 7 x 2
   Year    MCN
  <dbl>  <dbl>
1  2015 187292
2  2016 116003
3  2017  62439
4  2018  60787
5  2019  46994
6  2020  54220
7  2021  64891

> dams <- left_join(BON, MCN, by = "Year")
> dams
# A tibble: 7 x 3
   Year    BON    MCN
  <dbl>  <dbl>  <dbl>
1  2015 265558 187292
2  2016 172614 116003
3  2017 107524  62439
4  2018 108045  60787
5  2019  71235  46994
6  2020  79714  54220
7  2021  87233  64891

Upvotes: 2


Answers (2)

ament

Reputation: 83

Thanks @jpdugo17, that took me in the right direction. Using your map2() approach, here is what gets me what I need.


dam_names <- c("BON", "MCN")
# Julian-day window to keep for each dam
chs_count_julian_days <- list(BON = 1:167, MCN = 1:175)

Year_start <- 2008
Year_end <- 2021
spCK_adult_annual <- 
  # ..1 is the dam name, ..2 is that dam's Julian-day window
  map2(dam_names, chs_count_julian_days, ~
         filter(dam_counts, Year %in% Year_start:Year_end, JulianDay %in% ..2, Dam == ..1, 
                Species == "Chinook", LifeStage == "Adult") %>%
         group_by(Year) %>%
         summarize('{..1}' := sum(Count)) %>% 
         select(-Year)) %>% 
  set_names(dam_names) %>% 
  as_tibble() %>% 
  # Year is re-attached by position, so every dam needs one row per year in Year_start:Year_end
  mutate(Year = Year_start:Year_end, .before = everything())
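
One caveat with re-attaching Year by position: it only lines up if every dam has exactly one row for every year in the range. A minimal sketch of a variant that keeps Year in each summary and joins on it instead is below. The toy_counts data and its values are made up for illustration, and imap() is swapped in for map2() so the dam name comes from the list names.

```r
library(dplyr)
library(purrr)

# Toy stand-in for dam_counts (hypothetical values, not the real data)
toy_counts <- tibble(
  Dam       = c("BON", "BON", "BON", "MCN", "MCN", "MCN"),
  Year      = c(2015, 2015, 2016, 2015, 2016, 2016),
  JulianDay = c(10, 170, 20, 172, 30, 180),
  Species   = "Chinook",
  LifeStage = "Adult",
  Count     = c(5, 7, 3, 4, 6, 9)
)

# Julian-day window per dam; the list names double as output column names
julian_days <- list(BON = 1:167, MCN = 1:175)

annual <- imap(julian_days, function(days, dam) {
  toy_counts %>%
    filter(Dam == dam, JulianDay %in% days,
           Species == "Chinook", LifeStage == "Adult") %>%
    group_by(Year) %>%
    summarize("{dam}" := sum(Count))
}) %>%
  # Join on Year so a dam with a missing year gets NA instead of misaligned rows
  reduce(function(x, y) full_join(x, y, by = "Year")) %>%
  arrange(Year)

annual
```

Because the join is keyed on Year, adding a third dam only means adding another entry to julian_days.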

Upvotes: 1

jpdugo17

Reputation: 7106

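We can use the map2() function from the purrr package to run the same filter-and-summarize step once per dam, passing in the dam name and its JulianDay range.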

library(tidyverse)

dam_counts <-
  read_table("Year Month   Day JulianDay Species   LifeStage ClipStatus Count
2014     1     1         1 Coho      Adult     Total          1
2015     1     1         1 Coho      Jack      Total         -1
2014     1     1         1 Sockeye   Adult     Total          0
2014     1     1         1 Steelhead Adult    Total          1
2014     1     1         1 Steelhead Adult     Unclipped      0
2014     1     1         1 Pink      NA        Total          0
2014     1     1         1 shad      NA        Total          0
2014     1     1         1 Chum      NA        Total          0
2015     1     1         1 Chinook   Adult  Total          0
2015     1     1         1 Chinook   Adult  Total          0")

dam_counts <-
  dam_counts %>%
  mutate(Dam = c(rep("BON", 9), "MCN")) %>%
  select(Dam, everything())


summs <- 
  map2(c("BON", "MCN"), list(1:167, 1:175), ~
         filter(dam_counts, Year %in% 2014:2021, JulianDay %in% ..2, Dam == ..1,
                Species == "Chinook", LifeStage == "Adult") %>%
         group_by(Year) %>%
         summarize('{..1}' := sum(Count))) %>% 
  set_names(c('BON', 'MCN'))

left_join(summs$BON, summs$MCN, by = "Year")
#> # A tibble: 1 × 3
#>    Year   BON   MCN
#>   <dbl> <dbl> <dbl>
#> 1  2015     0     0

Created on 2021-11-20 by the reprex package (v2.0.1)
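
If the goal is a single pipeline with no intermediate objects, another option is to put the per-dam cutoffs in a small lookup table, join it on, and reshape at the end. This is only a sketch: the cutoffs table, the MaxJulianDay column, and the toy_counts data are hypothetical names and values, and it assumes JulianDay always starts at 1, so that JulianDay <= 167 is equivalent to JulianDay %in% 1:167. As far as I know, across() applies a function over several columns rather than over cases of one variable, so it is probably not the tool for this.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for dam_counts (hypothetical values, not the real data)
toy_counts <- tibble(
  Dam       = c("BON", "BON", "BON", "MCN", "MCN", "MCN"),
  Year      = c(2015, 2015, 2016, 2015, 2016, 2016),
  JulianDay = c(10, 170, 20, 172, 30, 180),
  Species   = "Chinook",
  LifeStage = "Adult",
  Count     = c(5, 7, 3, 4, 6, 9)
)

# One row per dam with its last Julian day to include (assumed lookup table)
cutoffs <- tibble(Dam = c("BON", "MCN"), MaxJulianDay = c(167, 175))

dams <- toy_counts %>%
  inner_join(cutoffs, by = "Dam") %>%               # attach each dam's cutoff
  filter(JulianDay <= MaxJulianDay,
         Species == "Chinook", LifeStage == "Adult") %>%
  group_by(Year, Dam) %>%
  summarize(Count = sum(Count), .groups = "drop") %>%
  pivot_wider(names_from = Dam, values_from = Count) # one column per dam

dams
```

The inner_join() also drops any dam that is not listed in cutoffs, which replaces the explicit Dam == "..." filters.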

Upvotes: 1
