Reputation: 13
I have multiple data frames (moving temperatures of different durations at 130 observation points) and want to generate monthly averages for all of them by applying the code below to each data frame, then put the results into one data frame. I have been trying to do this with a for loop, but I'm not getting anywhere. I'm relatively new to R and would really appreciate it if someone could help me get through this.
Here is a glimpse of one data frame:
head(maxT2016[,1:5])
         X       X0       X1       X2       X3
1 20160101 26.08987 26.08987 26.08987 26.08987
2 20160102 25.58242 25.58242 25.58242 25.58242
3 20160103 25.44290 25.44290 25.44290 25.44290
4 20160104 26.88043 26.88043 26.88043 26.88043
5 20160105 26.60278 26.60278 26.60278 26.60278
6 20160106 24.87676 24.87676 24.87676 24.87676
str(maxT2016)
'data.frame': 274 obs. of 132 variables:
$ X : int 20160101 20160102 20160103 20160104 20160105 20160106 20160107 20160108 20160109 20160110 ...
$ X0 : num 26.1 25.6 25.4 26.9 26.6 ...
$ X1 : num 26.1 25.6 25.4 26.9 26.6 ...
$ X2 : num 26.1 25.6 25.4 26.9 26.6 ...
$ X3 : num 26.1 25.6 25.4 26.9 26.6 ...
Here is the code that I use for an individual data frame:
library(dplyr)
library(lubridate)
library(tidyverse)
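# Parse the yyyymmdd integers in X as dates
maxT10$X <- as.Date(as.character(maxT10$X), format="%Y%m%d")
# Average every point by calendar month, then keep the last six months
monthlyAvr <- maxT10 %>%
  group_by(month=floor_date(X, "month")) %>%
  summarise(across(X0:X130, function(x) mean(x, na.rm=TRUE))) %>%
  slice_tail(n=6) %>%
  select(-month)
# Transpose so that each observation point becomes a row
monthlyAvr2 <- as.data.frame(t(monthlyAvr))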
colnames(monthlyAvr2) <- c("meanT_Apr", "meanT_May", "meanT_Jun", "meanT_Jul", "meanT_Aug",
"meanT_Sep")
Essentially, I want to put all the data frames into a list, run the function on each of them, and then combine the outputs into one data frame. I came across the lapply function as an alternative, but I feel more comfortable with a for loop.
d = list(maxT10, maxT20, maxT30, maxT60 ... ...)
for (i in 1:length(d)){
}
MonthlyAvrT <- cbind(maxT10, maxT20, maxT30, maxT60... ... )
Upvotes: 1
Views: 79
Reputation: 503
The logic in pseudo-code would be:
for each data.frame in list
apply a function
save the results
Applying my_function on each data.frame of the data_set list:
my_function <- function(my_df) {
my_df <- as.data.frame(my_df)
out <- apply(my_df, 2, mean) # compute mean on dimension 2 (columns)
return(out)
}
# 100 data.frames
data_set <- replicate(100, data.frame(X=runif(6, 20160101, 20160131), X0=rnorm(6, 25)))
> dim(data_set)
[1]   2 100
results <- apply(data_set, 2, my_function) # Apply my_function on dimension 2
# Output for first 5 data.frames
> results[, 1:5]
           [,1]         [,2]         [,3]         [,4]         [,5]
X  2.016012e+07 2.016011e+07 2.016011e+07 2.016012e+07 2.016011e+07
X0 2.533888e+01 2.495086e+01 2.523087e+01 2.491822e+01 2.482142e+01
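Since the question asked for a for loop, the pseudo-code can also be written as an explicit loop over an ordinary list. This is only a minimal sketch on made-up data: `df_list` and `results_list` are names invented for illustration, and `my_function` is re-declared here with `colMeans` so that the block runs on its own.

```r
# Hypothetical toy data: a plain list of 3 small data frames
set.seed(1)
df_list <- lapply(1:3, function(i) {
  data.frame(X0 = rnorm(6, 25), X1 = rnorm(6, 25))
})

# Column means of one data frame (same idea as my_function above)
my_function <- function(my_df) {
  colMeans(my_df, na.rm = TRUE)
}

# Pre-allocate the result list, then fill one slot per data frame
results_list <- vector("list", length(df_list))
for (i in seq_along(df_list)) {
  results_list[[i]] <- my_function(df_list[[i]])
}

# Stack the results: one row per data frame, one column per variable
results <- do.call(rbind, results_list)
```

Pre-allocating `results_list` and indexing with `seq_along()` avoids growing an object inside the loop and behaves sensibly if the list happens to be empty.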
Upvotes: 0
Reputation: 12461
Basil. Welcome to StackOverflow.
I was wary of lapply when I first started using R, but you should stick with it. It's almost always more efficient than using a for loop. In your particular case, you can put your individual data frames in a list and the code you run on each into a function, myFunc, say, which takes the data frame you want to process as its argument.
Then you can simply say:
allData <- bind_rows(lapply(dataFrameList, myFunc))
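To make that concrete, here is one way myFunc might look if it simply wraps the steps from the question. Treat it as a sketch under assumptions rather than a drop-in solution: `make_toy` and its toy values are invented stand-ins for the real data frames, the `point` and `source` columns are hypothetical additions, and the month labels assume that the last six months in each data frame are April to September, as in the question.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-ins for maxT10, maxT20, ...: daily values, Jan-Sep 2016
make_toy <- function(offset) {
  days <- seq(as.Date("2016-01-01"), as.Date("2016-09-30"), by = "day")
  data.frame(
    X  = as.integer(format(days, "%Y%m%d")),
    X0 = 25 + offset,
    X1 = 26 + offset
  )
}
dataFrameList <- list(maxT10 = make_toy(0), maxT20 = make_toy(1))

# The question's per-data-frame steps, wrapped in a function
myFunc <- function(df) {
  monthly <- df %>%
    mutate(X = as.Date(as.character(X), format = "%Y%m%d")) %>%
    group_by(month = floor_date(X, "month")) %>%
    summarise(across(-X, function(v) mean(v, na.rm = TRUE))) %>%
    slice_tail(n = 6) %>%
    select(-month)
  out <- as.data.frame(t(monthly))
  colnames(out) <- paste0("meanT_", c("Apr", "May", "Jun", "Jul", "Aug", "Sep"))
  out$point <- rownames(out)  # keep the observation-point label as a column
  rownames(out) <- NULL
  out
}

# .id adds a column recording which list element each row came from
allData <- bind_rows(lapply(dataFrameList, myFunc), .id = "source")
```

Naming the list elements is what lets `.id = "source"` label the rows. Note that this stacks the results by row; if the side-by-side layout from the question's `cbind` is what you need, the same list of outputs could be combined with `bind_cols` instead.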
Incidentally, your column names make me think your data isn't yet tidy. I'd suggest you spend a little time making it so before you do much else. It will save you a huge amount of effort in the long run.
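As a rough illustration of what I mean by tidy, here is a sketch that reshapes a small made-up data frame in your layout into one row per date and observation point. The toy values and the names `wide`, `long`, `point` and `maxT` are all invented for the example, and this is only one reasonable reading of "tidy" for your data.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical wide toy data in the question's layout: one column per point
wide <- data.frame(
  X  = c(20160101L, 20160102L, 20160201L),
  X0 = c(26.1, 25.6, 25.4),
  X1 = c(26.9, 26.6, 24.9)
)

# One row per (date, point) pair instead of one column per point
long <- wide %>%
  mutate(date = as.Date(as.character(X), format = "%Y%m%d")) %>%
  select(-X) %>%
  pivot_longer(-date, names_to = "point", values_to = "maxT")

# Monthly means no longer need a column range such as X0:X130
monthly <- long %>%
  group_by(point, month = floor_date(date, "month")) %>%
  summarise(meanT = mean(maxT, na.rm = TRUE), .groups = "drop")
```

In this shape, adding more observation points or more months changes the number of rows rather than the code.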
Upvotes: 2