Basil

Reputation: 13

Apply a set of functions over multiple data frames and merge the outputs in R

I have multiple data frames (moving temperature of different durations at 130 observation points) and want to generate monthly averages for all the data by applying the code below to each data frame, then put the outcomes into one data frame. I have been trying to do this with a for loop, but am not getting anywhere. I'm relatively new to R and would really appreciate it if someone could help me get through this.

Here is a glimpse of one data frame:

head(maxT2016[,1:5])

      X       X0       X1       X2       X3
1 20160101 26.08987 26.08987 26.08987 26.08987
2 20160102 25.58242 25.58242 25.58242 25.58242
3 20160103 25.44290 25.44290 25.44290 25.44290
4 20160104 26.88043 26.88043 26.88043 26.88043
5 20160105 26.60278 26.60278 26.60278 26.60278
6 20160106 24.87676 24.87676 24.87676 24.87676

str(maxT2016)
'data.frame':   274 obs. of  132 variables:
$ X   : int  20160101 20160102 20160103 20160104 20160105 20160106 20160107 20160108 20160109 20160110 ...
$ X0  : num  26.1 25.6 25.4 26.9 26.6 ...
$ X1  : num  26.1 25.6 25.4 26.9 26.6 ...
$ X2  : num  26.1 25.6 25.4 26.9 26.6 ...
$ X3  : num  26.1 25.6 25.4 26.9 26.6 ...

Here is the code that I use for an individual data frame:

library(dplyr)
library(lubridate)
library(tidyverse)

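# Convert the yyyymmdd integers in X to Date
maxT10$X <- as.Date(as.character(maxT10$X), format="%Y%m%d")

# Monthly mean of every observation point; keep the last six months
monthlyAvr <- maxT10 %>%
  group_by(month=floor_date(X, "month")) %>%
  summarise(across(X0:X130, ~ mean(.x, na.rm=TRUE))) %>%
  slice_tail(n=6) %>%
  select(-month)

# Transpose so that each row is an observation point and each column a month
monthlyAvr2 <- as.data.frame(t(monthlyAvr))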
colnames(monthlyAvr2) <- c("meanT_Apr", "meanT_May", "meanT_Jun", "meanT_Jul", "meanT_Aug", 
"meanT_Sep")

Essentially, I want to put all the data frames into a list, run the function over every data frame, then collect the outputs into one data frame. I came across the lapply function as an alternative, but felt somewhat more comfortable with a for loop.

d = list(maxT10, maxT20, maxT30, maxT60 ... ...)

for (i in seq_along(d)){

}

MonthlyAvrT <- cbind(maxT10, maxT20, maxT30, maxT60... ... ) 

Upvotes: 1

Views: 79

Answers (2)

Trusky

Reputation: 503

The logic in pseudo-code would be:

for each data.frame in list
    apply a function
    save the results

Applying my_function to each data.frame in data_set:

my_function <- function(my_df) {

  my_df <- as.data.frame(my_df)
  out <- apply(my_df, 2, mean)  # compute mean on dimension 2 (columns)
  return(out)

}

# 100 data.frames
data_set <- replicate(100, data.frame(X=runif(6, 20160101, 20160131), X0=rnorm(6, 25)))
> dim(data_set) 
[1]   2 100
results <- apply(data_set, 2, my_function)  # Apply my_function on dimension 2

# Output for first 5 data.frames
> results[, 1:5]
           [,1]         [,2]         [,3]         [,4]         [,5]
X  2.016012e+07 2.016011e+07 2.016011e+07 2.016012e+07 2.016011e+07
X0 2.533888e+01 2.495086e+01 2.523087e+01 2.491822e+01 2.482142e+01
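
For anyone more comfortable with an explicit loop, the pseudo-code at the top of this answer might translate roughly as follows. This is only a sketch: data_list is a made-up stand-in for the real list of data frames, and colMeans stands in for whatever function you actually want to apply.

```r
# Hypothetical stand-ins for the real data frames: three small frames
# with a date-like column X and one value column X0.
set.seed(1)
data_list <- lapply(1:3, function(i) {
  data.frame(X = 20160101:20160106, X0 = rnorm(6, mean = 25))
})

# Pre-allocate a list to hold one result per data frame.
results <- vector("list", length(data_list))

for (i in seq_along(data_list)) {
  # Apply the function (here: column means) and save the result.
  results[[i]] <- colMeans(data_list[[i]])
}

# Combine the per-frame results into one matrix, one column per data frame.
combined <- do.call(cbind, results)
```

Pre-allocating results and filling it by index avoids growing an object inside the loop, which is the usual reason for loops feel slow in R.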

Upvotes: 0

Limey

Reputation: 12461

Basil. Welcome to StackOverflow.

I was wary of lapply when I first started using R, but you should stick with it. It's almost always more efficient than using a for loop. In your particular case, you can put your individual data frames in a list and wrap the code you run on each in a function, myFunc say, which takes the data frame you want to process as its argument.

Then you can simply say

allData <- bind_rows(lapply(dataFrameList, myFunc))
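
As a rough illustration, myFunc could wrap the monthly-average pipeline from the question along these lines. The body of myFunc, the helper make_df, the toy data and the .id = "source" argument are all assumptions made for the sake of a runnable sketch; adapt them to the real data frames.

```r
library(dplyr)
library(lubridate)

# Hypothetical helper: monthly means for one data frame whose column X
# holds dates as yyyymmdd integers and whose other columns are numeric.
myFunc <- function(df) {
  df %>%
    mutate(X = as.Date(as.character(X), format = "%Y%m%d")) %>%
    group_by(month = floor_date(X, "month")) %>%
    summarise(across(-X, ~ mean(.x, na.rm = TRUE)), .groups = "drop")
}

# Made-up example data: two small frames spanning two months.
make_df <- function(offset) {
  data.frame(
    X  = c(20160401L, 20160402L, 20160501L, 20160502L),
    X0 = c(1, 3, 5, 7) + offset,
    X1 = c(2, 4, 6, 8) + offset
  )
}
dataFrameList <- list(maxT10 = make_df(0), maxT20 = make_df(10))

# Run myFunc on every data frame and stack the results; .id records
# which list element each row came from.
allData <- bind_rows(lapply(dataFrameList, myFunc), .id = "source")
```

Naming the list elements and passing .id keeps track of which data frame each monthly row came from, which is easy to lose once everything is stacked.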

Incidentally, your column names make me think your data isn't yet tidy. I'd suggest you spend a little time making it so before you do much else. It will save you a huge amount of effort in the long run.
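
One possible reading of "tidy" here is a long layout with one row per date and observation point. A minimal sketch of that reshaping, assuming the X0, X1, ... columns are observation points and using a made-up two-day frame (the names wide, long, point and temperature are illustrative only):

```r
library(dplyr)
library(tidyr)

# Made-up wide frame: one row per day, one column per observation point.
wide <- data.frame(
  X  = c(20160101L, 20160102L),
  X0 = c(26.1, 25.6),
  X1 = c(26.3, 25.4)
)

# Reshape so that each row is one (date, point) observation.
long <- wide %>%
  mutate(date = as.Date(as.character(X), format = "%Y%m%d")) %>%
  select(-X) %>%
  pivot_longer(
    cols = -date,
    names_to = "point",
    names_prefix = "X",
    values_to = "temperature"
  ) %>%
  mutate(point = as.integer(point))
```

In this layout a monthly average per point becomes a single group_by on month and point followed by summarise, with no need to name a column range.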

Upvotes: 2
