Reputation: 77
import pandas as pd
import xlsxwriter as wr
rf = pd.read_excel('M07241.xlsx', skiprows=5)
rf = rf[['Name.2', 'Time.1']]
rf = rf.rename(columns = {'Name.2': 'participant', 'Time.1': 'reaction time'}, inplace = False)
part1=rf[rf['participant'].str.contains("PART 1")]['reaction time'].mean()
part2=rf[rf['participant'].str.contains("PART 2")]['reaction time'].mean()
part3=rf[rf['participant'].str.contains("PART 3")]['reaction time'].mean()
outWorkbook = wr.Workbook("maniout.xlsx")
outSheet = outWorkbook.add_worksheet()
part1mean = [part1]
part2mean = [part2]
part3mean = [part3]
outSheet.write("A1", "ps-part1")
outSheet.write("B1", "ps-part2")
outSheet.write("C1", "ps-part3")
outSheet.write(1, 0, part1mean[0])
outSheet.write(1, 1, part2mean[0])
outSheet.write(1, 2, part3mean[0])
outWorkbook.close()
The code reads and separates 'Part 1', 'Part 2', 'Part 3' from the first column, participant and outputs the average of each 'Part', for corresponding values in the second column reaction time, to another file("maniout.xlsx") under 3 columns named "ps-part1", "ps-part2", "ps-part3". I need to automate this for many similar files and have their result added/appended to "maniout.xlsx" under "ps-part1", "ps-part2", "ps-part3". Many Thanks
Upvotes: 0
Views: 56
Reputation: 781
I am assuming your all input files are in a single folder and all having the same type of data format( 2 columns with name as 'Name.2', 'Time.1')
So, in this case, we will loop through all the file and find mean of each data as per your code and add to list after finding all mean will write it xlsx file via pandas dataframe
import pandas as pd
mean_data = []
for root, _, files in os.walk("Folder path"): <-- update folder path
for file in files:
if file.endswith(".xlsx"):
rf = pd.read_excel(os.path.join(root, file), skiprows=5)
rf = rf[['Name.2', 'Time.1']]
rf = rf.rename(columns={'Name.2': 'participant', 'Time.1': 'reaction time'}, inplace=False)
part1 = rf[rf['participant'].str.contains("PART 1")]['reaction time'].mean()
part2 = rf[rf['participant'].str.contains("PART 2")]['reaction time'].mean()
part3 = rf[rf['participant'].str.contains("PART 3")]['reaction time'].mean()
mean_data.append([part1, part2, part3])
df = pd.DataFrame(mean_data, columns=['ps-part1', 'ps-part2', 'ps-part3'])
df.to_excel("folder path with output.xlsx", index=False) <-- update output file path
Upvotes: 1