Automate this code for many similar excel files and output to one file

Question

import pandas as pd
import xlsxwriter as wr

rf = pd.read_excel('M07241.xlsx', skiprows=5)
rf = rf[['Name.2', 'Time.1']]
rf = rf.rename(columns = {'Name.2': 'participant', 'Time.1': 'reaction time'}, inplace = False)

part1=rf[rf['participant'].str.contains("PART 1")]['reaction time'].mean()
part2=rf[rf['participant'].str.contains("PART 2")]['reaction time'].mean()
part3=rf[rf['participant'].str.contains("PART 3")]['reaction time'].mean()

outWorkbook = wr.Workbook("maniout.xlsx")
outSheet = outWorkbook.add_worksheet()

part1mean = [part1]
part2mean = [part2]
part3mean = [part3]
outSheet.write("A1", "ps-part1")
outSheet.write("B1", "ps-part2")
outSheet.write("C1", "ps-part3")
outSheet.write(1, 0, part1mean[0])
outSheet.write(1, 1, part2mean[0])
outSheet.write(1, 2, part3mean[0])
outWorkbook.close()

The code reads and separates 'Part 1', 'Part 2', 'Part 3' from the first column, participant and outputs the average of each 'Part', for corresponding values in the second column reaction time, to another file("maniout.xlsx") under 3 columns named "ps-part1", "ps-part2", "ps-part3". I need to automate this for many similar files and have their result added/appended to "maniout.xlsx" under "ps-part1", "ps-part2", "ps-part3". Many Thanks to append the other files here

Deven Ramani · Accepted Answer

I am assuming your all input files are in a single folder and all having the same type of data format( 2 columns with name as 'Name.2', 'Time.1')

So, in this case, we will loop through all the file and find mean of each data as per your code and add to list after finding all mean will write it xlsx file via pandas dataframe

import pandas as pd

mean_data = []

for root, _, files in os.walk("Folder path"): <-- update folder path
    for file in files:
        if file.endswith(".xlsx"):
            rf = pd.read_excel(os.path.join(root, file), skiprows=5)
            rf = rf[['Name.2', 'Time.1']]
            rf = rf.rename(columns={'Name.2': 'participant', 'Time.1': 'reaction time'}, inplace=False)

            part1 = rf[rf['participant'].str.contains("PART 1")]['reaction time'].mean()
            part2 = rf[rf['participant'].str.contains("PART 2")]['reaction time'].mean()
            part3 = rf[rf['participant'].str.contains("PART 3")]['reaction time'].mean()

            mean_data.append([part1, part2, part3])

df = pd.DataFrame(mean_data, columns=['ps-part1', 'ps-part2', 'ps-part3'])
df.to_excel("folder path with output.xlsx", index=False) <-- update output file path

Automate this code for many similar excel files and output to one file

Answers (1)

Related Questions