Reputation: 145
So I've managed to write a code in pandas to do the data analysis I need and export to a new .xlsx file. Which is awesome, except it's for one file and I typically have 40+ files I want to run this on.
Through research I managed to at least get it to read the file names in the folder, but I am at a loss on how to implement into my existing code.
Goal: To run code over each .xlsx file in folder and spit out the analyzed data as new .xlsx files.
For now here is the code I came up with to read the folder:
import os
import glob
os.chdir('C:/Users/PCTR261010/Desktop/PartReviewExport')
FileList = glob.glob('*.xlsx')
print(FileList)
Here is a snippet of the import section of my larger code file:
import os
import glob
import pandas as pd
# Prints header information in Part Scorecard
df = pd.read_excel('GAT_US_PartReview_2017-06-23.xlsx', header=None,
skipinitialspace=True, skiprows=1)
header = df.head(5).filter([0,2], axis=1)
# Begins Data Analysis of Part Scorecard
fields = ['Appl Req', 'Appl Count ', 'Intr Req', 'Intr Count ', 'OE Intr
Req', 'Has OE Intr', 'Has Attr Editor',
'Part IMG Req', 'Has Part IMG', 'Has MPCC', 'Warr Req', 'Has Warr
TXT', 'Has Warr PDF', 'MSDS Req',
'Has MSDS', 'UPC Req', 'Has UPC', 'Has UNSPSC', 'Valid Part']
df = pd.read_excel('GAT_US_PartReview_2017-06-23.xlsx',
skipinitialspace=True, skiprows=7, usecols=fields,
dtype=str)
Any help is appreciated!!
Upvotes: 0
Views: 224
Reputation: 145
OMG! When you spend all day staring at this and finally realize it's an indention issue. FML. Thanks guys!
Upvotes: 0
Reputation: 36598
You can iterate over each of the file names, passing them to pandas
import os
import glob
import pandas as pd
os.chdir('C:/Users/PCTR261010/Desktop/PartReviewExport')
FileList = glob.glob('*.xlsx')
print(FileList)
for fname in FileList:
# Prints header information in Part Scorecard
df = pd.read_excel(fname, header=None,
skipinitialspace=True, skiprows=1)
header = df.head(5).filter([0,2], axis=1)
# Begins Data Analysis of Part Scorecard
fields = ['Appl Req', 'Appl Count ', 'Intr Req', 'Intr Count ', 'OE Intr
Req', 'Has OE Intr', 'Has Attr Editor',
'Part IMG Req', 'Has Part IMG', 'Has MPCC', 'Warr Req', 'Has Warr
TXT', 'Has Warr PDF', 'MSDS Req',
'Has MSDS', 'UPC Req', 'Has UPC', 'Has UNSPSC', 'Valid Part']
df = pd.read_excel(fname, skipinitialspace=True,
skiprows=7, usecols=fields, dtype=str)
# ... analysis here ...
df.to_excel('out_' + fname)
Upvotes: 1