Reputation: 1158
I am trying to condense my data by calculating the mean of every 15 rows in my data set, like this:
n <- 15
aggregate(df[c("columnC", "ColumnD")], list(rep(1:(nrow(df) %/% n + 1), each = n, len = nrow(df))), mean)[-1]
This works, but I also have 2 other columns that hold discrete values. Obviously I cannot take the mean of discrete values, and the code above drops those columns and keeps only columnC and columnD. How can I do this so that, for each of the discrete columns, I just take the value from the 15th row of each group?
For example, if I have data like this:
1 Sunday Evening 16.2 235.84
2 Sunday Evening 23.4 235.29
3 Sunday Evening 29.4 232.79
4 Sunday Evening 24.2 233.89
5 Sunday Evening 24.2 233.66
6 Sunday Evening 24.2 233.38
7 Sunday Evening 24.2 232.99
8 Sunday Evening 25.4 233.21
9 Sunday Evening 26.8 232.37
10 Sunday Night 25.6 231.55
11 Sunday Night 24.4 231.19
12 Sunday Night 24.4 231.63
13 Sunday Night 24.4 231.71
14 Sunday Night 25.2 231.23
15 Sunday Night 25.2 231.23
I would want to take the mean of the third and fourth columns, and for the first and second columns I'd be happy with "Sunday" and "Night", because those are the values in the 15th row.
Upvotes: 0
Views: 105
Reputation: 388982
Just to simplify, for the example you shared I took n = 3 and used dplyr in the following way:
library(dplyr)
n <- 3
df %>%
  group_by(group = rep(1:(nrow(df) %/% n + 1), each = n, len = nrow(df))) %>%
  summarise(three_mean = mean(V3),
            four_mean = mean(V4),
            last_v1 = last(V1),
            last_v2 = last(V2))
#  group three_mean four_mean last_v1 last_v2
#  <int>      <dbl>     <dbl> <fct>   <fct>
#1     1       23.0       235 Sunday  Evening
#2     2       24.2       234 Sunday  Evening
#3     3       25.5       233 Sunday  Evening
#4     4       24.8       231 Sunday  Night
#5     5       24.9       231 Sunday  Night
This returns the mean of every 3 rows for columns 3 and 4, and takes the last value in each group for columns 1 and 2.
For your real example, this should work if you change n to 15.
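If you would rather stay in base R, as in your aggregate() call, the same idea can be sketched roughly as below. This is only a minimal sketch under some assumptions: the column names V1 to V4 are the placeholder names used above (not your real ones), the data frame is rebuilt from the 15 rows you posted, and the helper names grp, last_idx, means and lasts are made up for illustration.

```r
# Sample data rebuilt from the 15 rows in the question (V1..V4 are assumed names)
df <- data.frame(
  V1 = rep("Sunday", 15),
  V2 = rep(c("Evening", "Night"), times = c(9, 6)),
  V3 = c(16.2, 23.4, 29.4, 24.2, 24.2, 24.2, 24.2, 25.4, 26.8,
         25.6, 24.4, 24.4, 24.4, 25.2, 25.2),
  V4 = c(235.84, 235.29, 232.79, 233.89, 233.66, 233.38, 232.99, 233.21,
         232.37, 231.55, 231.19, 231.63, 231.71, 231.23, 231.23)
)

n <- 15

# Group index: rows 1..n get 1, rows (n + 1)..2n get 2, and so on
grp <- (seq_len(nrow(df)) - 1) %/% n + 1

# Mean of the numeric columns within each group
means <- aggregate(df[c("V3", "V4")], list(group = grp), mean)

# Row number of the last row in each group, used to pick the discrete values
last_idx <- as.integer(tapply(seq_len(nrow(df)), grp, max))
lasts <- df[last_idx, c("V1", "V2"), drop = FALSE]

# Both pieces are ordered by group, so they can be bound side by side
result <- cbind(lasts, means[c("V3", "V4")])
rownames(result) <- NULL
result
```

If the number of rows is not a multiple of n, the final group is simply shorter, and its discrete values come from the last available row.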
Upvotes: 1