Reputation: 1
I am trying to compute averages over rows spread across multiple CSV files. I have read the files from the directory and reduced each one to the 2 relevant columns. The problem is that I would like to find the average for specific rows, based on their values, across all 66 CSV files in the directory. My code is stuck here:
# Set path to folder
folder.path <- getwd()
# Get list of csv files in folder (pattern is a regular expression, not a glob)
filenames <- list.files(folder.path, pattern = "\\.csv$", full.names = TRUE)
# Read all CSV files in the folder and create a list of data frames
ldf <- lapply(filenames, read.csv)
# Select the Variable1 and Variable2 columns in each data frame in the list
ldf <- lapply(ldf, "[", c("Variable1", "Variable2"))
# Check which variables are left in each data frame
lapply(ldf, names)
Variable 1 is numeric and Variable 2 is text. I want to find the average of Variable 1 grouped by the category in Variable 2: for example, the average of Variable 1 when Variable 2 is A, the average of Variable 1 when Variable 2 is B, etc., for all 66 CSV files.
Upvotes: 0
Views: 186
Reputation: 887223
We can use data.table methods:
library(data.table)
# For each data frame, compute the mean of Variable1 within each Variable2 group
out <- lapply(ldf, function(x) setDT(x)[,
  .(Variable1 = mean(Variable1, na.rm = TRUE)),
  by = Variable2])
Upvotes: 0
Reputation: 389047
You can use lapply to iterate over each data frame in ldf and aggregate them to get the mean value of Variable1 for each unique value in Variable2.
result <- lapply(ldf, function(x) aggregate(Variable1~Variable2, x, mean, na.rm = TRUE))
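To see what the per-file aggregation returns, here is a minimal sketch on made-up data. The list `toy_ldf` and its values are hypothetical stand-ins for the data frames read from the CSV files; only the column names Variable1 and Variable2 come from the question.

```r
# Hypothetical toy data standing in for two of the CSV files
toy_ldf <- list(
  data.frame(Variable1 = c(1, 3, 10), Variable2 = c("A", "A", "B")),
  data.frame(Variable1 = c(2, NA, 6), Variable2 = c("A", "B", "B"))
)

# One summary table per data frame: mean of Variable1 within each Variable2 group
toy_result <- lapply(toy_ldf, function(x) aggregate(Variable1 ~ Variable2, x, mean, na.rm = TRUE))

toy_result[[1]]
#   Variable2 Variable1
# 1         A         2
# 2         B        10
```

With the formula interface, rows where Variable1 is NA are dropped before the mean is taken, so the second toy data frame yields 2 for A and 6 for B.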
Using tidyverse, you could do this as:
library(dplyr)
library(purrr)
result <- map(ldf, ~.x %>%
  group_by(Variable2) %>%
  summarise(Variable1 = mean(Variable1, na.rm = TRUE)))
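The approaches above return one summary table per file. If the goal is instead a single average per category pooled over all 66 files, one possibility in base R is to stack the data frames first and aggregate once. This is only a sketch under that assumption; `toy_ldf`, `combined` and `pooled` are hypothetical names, and the toy values are made up.

```r
# Hypothetical toy data standing in for the list of data frames read from the CSV files
toy_ldf <- list(
  data.frame(Variable1 = c(1, 3, 10), Variable2 = c("A", "A", "B")),
  data.frame(Variable1 = c(2, NA, 6), Variable2 = c("A", "B", "B"))
)

# Stack all data frames into one (they share the same two columns)
combined <- do.call(rbind, toy_ldf)

# Mean of Variable1 per Variable2 category, pooled over every file
pooled <- aggregate(Variable1 ~ Variable2, combined, mean, na.rm = TRUE)
pooled
#   Variable2 Variable1
# 1         A         2
# 2         B         8
```

Note that pooling weights every row equally, so files with more rows in a category contribute more to its average than they would if the per-file means were averaged.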
Upvotes: 1