Reputation: 5
For the purpose of writing my thesis mechanical engineering i recieved a lot of sensor data that is configured in multiple excel files(100) with multiple sheets(22).
Now I want to visualize this into power Bi but the .xlxs
files a working way to slow so i want all the data(sheets) in seperate CSV files.
I don't have any real experience with programming but will be able to run a script in jupyter or spyder.
I tried a code in VBA that configured multiple excels into csv, but this only worked for the first sheet in the .xlsx
file.
I also used the code below in jupyter notebook; but this gives me all the sheets for one single excel.
data = pd.read_excel('file_name.file_format', sheet_name=None)
for sheet_name, df in data.items():
df.to_csv(f'{sheet_name}.csv')
Does anyone have a code for this purpose or does someone know how to adjust the code above to do this with all excel files in a folder?
Upvotes: 0
Views: 331
Reputation: 5
Some files unfortunatly have extra sensors & data which means extra workssheets. But i got this code working now:
import os
import pandas as pd
directory = "./"
files = os.listdir(directory)
for xlxs_file in files:
if ".xlsx" in xlxs_file:
filename = xlxs_file.strip(".xlsx")
xlxs_file = os.path.join(directory, xlxs_file)
data = pd.read_excel(xlxs_file, sheet_name=None)
for sheet_name, df in data.items():
df.to_csv("{}-{}.csv".format(filename, sheet_name))
Upvotes: 0
Reputation: 644
You could try looping through the directory for each file that has .xlxs, just replace YOUR_DIR with your own path to folder containing the files.
I've added "filename" which is just filename without extension so you can add this to .csv filename
import os
directory = "\\YOUR_DIR\\HERE"
files = os.listdir(directory)
for xlxs_file in files:
if ".xlxs" in xlxs_file:
filename = xlxs_file.strip(".xlxs")
xlxs_file = directory + "\\" + xlxs_file
data = pd.read_excel(xlxs_file, sheet_name=None)
for sheet_name, df in data.items():
df.to_csv(f'{filename}_{sheet_name}.csv')
Upvotes: 0
Reputation: 560
So long as the sheetnames are the same in each file then this should work:
import os
import pandas as pd
# target directory where the workbooks lie
tgt_dir = r'paste\directory\here\make\sure\to\keep\letter\r\before\quote'
# list of any files within the dir that have .xlsx in them
list_xl_files = [f for f in os.listdir(tgt_dir) if '.xlsx' in f.lower()]
# type a list of the sheets you want to target and extract
list_target_sheets = ['Sheet1', 'Sheet2', 'etc']
# iterate through each file and for each sheet in target sheets
for xl_file in list_xl_files:
for sheet in list_target_sheets:
# read in the file and target sheet
df = pd.read_excel(tgt_dir+'\\'+xl_file, sheet_name=sheet)
# export to csv but replace .xlsx with nothing
# then add _sheetname.csv so the filename shows the sheet too
df.to_csv(tgt_dir+'\\'+xl_file.replace('.xlsx','')+'_'+sheet_name+'.csv')
Upvotes: 1