rawkz

Reputation: 1

How can I find the average for certain rows across multiple CSV files?

I am trying to find averages based on rows in multiple CSV files. I have read the files from the directory and trimmed each one to include only the 2 applicable columns. The problem is that I would like to find the average for specific rows, based on their values, for all 66 CSV files inside the directory. My code is basically stuck here:

# Set path to folder
folder.path <- getwd()

# Get list of csv files in folder
filenames <- list.files(folder.path, pattern = "\\.csv$", full.names = TRUE)

# Read all CSV files in the folder and create a list of data frames
ldf <- lapply(filenames, read.csv)

# Keep only the two columns of interest in each data frame in the list
ldf <- lapply(ldf, "[", c("Variable1", "Variable2"))

# Check which columns are left in each data frame
lapply(ldf, names)

Variable1 is numeric and Variable2 is text. I want to find the average of Variable1 grouped by the categories of Variable2. For example, the average of Variable1 when Variable2 is A, the average of Variable1 when Variable2 is B, etc., for all 66 CSV files.

Upvotes: 0

Views: 186

Answers (2)

akrun

Reputation: 887223

We can use data.table methods to get the mean of Variable1 for each value of Variable2 within each data frame:

library(data.table)
# Convert each data frame to a data.table and average Variable1 by Variable2
out <- lapply(ldf, function(x) setDT(x)[,
        .(Variable1 = mean(Variable1, na.rm = TRUE)),
        by = Variable2])

Upvotes: 0

Ronak Shah

Reputation: 389047

You can use lapply to iterate over each data frame in ldf and aggregate it to get the mean of Variable1 for each unique value of Variable2.

result <- lapply(ldf, function(x) aggregate(Variable1~Variable2, x, mean, na.rm = TRUE))

With the tidyverse, you could do this as:

library(dplyr)
library(purrr)

result <- map(ldf, ~.x %>% 
                     group_by(Variable2) %>% 
                     summarise(Variable1 = mean(Variable1, na.rm = TRUE)))
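
The approaches above return one summary per file. If what is wanted is instead a single average per category pooled over all the files, a minimal base R sketch could look like the following. The small `ldf` here is made-up toy data standing in for the list read from the CSV files, and it assumes every data frame has the same two columns.

```r
# Toy stand-in for ldf (hypothetical data, not the asker's files)
ldf <- list(
  data.frame(Variable1 = c(1, 2, 3), Variable2 = c("A", "A", "B")),
  data.frame(Variable1 = c(5, NA, 7), Variable2 = c("A", "B", "B"))
)

# Stack all data frames into one, row by row
combined <- do.call(rbind, ldf)

# Average Variable1 for each value of Variable2 across all files;
# the formula interface drops rows containing NA by default (na.action = na.omit)
pooled <- aggregate(Variable1 ~ Variable2, data = combined, FUN = mean)
print(pooled)
```

Note that this weights every row equally, so files with more rows contribute more to each category's average than they would if the per-file means were averaged.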

Upvotes: 1
