console.log
console.log

Reputation: 184

Calculate average of values in R and add result as new rows instead of as a new column

I have a dataframe like the following one:

day year    value
1   2014    5   
1   2015    16  
1   2016    0   
2   2014    3   
2   2015    1   
2   2016    4   

and I want to calculate the average value by day for the three year period (2014, 2015, 2016). The following code works for this purpose:

data  %>% 
  group_by(day) %>% 
  mutate(MEAN = mean(value)) 

and produces this output:

day year    value  MEAN
1   2014    5      7
1   2015    16     7
1   2016    0      7
2   2014    3      3
2   2015    1      3
2   2016    4      3

but I want to add the average values as new rows in the same dataframe as follows:

day year    value
1   2014    5   
1   2015    16  
1   2016    0   
2   2014    3   
2   2015    1   
2   2016    4
1   avg     7 <--
2   avg     3 <--

Any suggestions about how can I possibly do this? Thanks!

Upvotes: 1

Views: 2552

Answers (2)

ThomasIsCoding
ThomasIsCoding

Reputation: 102579

Here is a base R option using aggregate

rbind(df,cbind(aggregate(value~day,df,mean),year = "avg")[c(1,3,2)])

or a variation (by @thelatemail from comments)

rbind(df, aggregate(df["value"], cbind(df["day"], year="avg"), FUN=mean))

which gives

  day year     value
1   1 2014  5.000000
2   1 2015 16.000000
3   1 2016  0.000000
4   2 2014  3.000000
5   2 2015  1.000000
6   2 2016  4.000000
7   1  avg  7.000000
8   2  avg  2.666667

Upvotes: 1

akrun
akrun

Reputation: 887741

We can use summarise (instead of mutate - which adds a new column in the original dataset) to calculate the mean and then with bind_rows can bind with original data. The tidyverse functions are very particular about type, so make sure the class are the same before we do the binding

library(dplyr)
data %>%
   group_by(day) %>%
   summarise(year = 'avg', value = mean(value)) %>% 
   bind_rows(data %>% 
               mutate(year = as.character(year)), .)
#  day year value
#1   1 2014  5.00
#2   1 2015 16.00
#3   1 2016  0.00
#4   2 2014  3.00
#5   2 2015  1.00
#6   2 2016  4.00
#7   1  avg  7.00
#8   2  avg  2.67

Another option is to split by the 'day' and then with add_row (from tibble) create a new row on each of the list elements

library(tibble)
library(purrr)
data %>% 
  mutate(year = as.character(year)) %>% 
  group_split(day) %>% 
  map_dfr(~ .x %>% add_row(day = first(.$day),
       year = 'avg', value = mean(.$value)))

Upvotes: 2

Related Questions