Reputation: 129
I want to save all directory info (path, folder, subfolder, and files) to an Excel spreadsheet using Pandas.
Here is my code so far:
import os
import pandas as pd
# setup the paths
root_path = os.path.join(os.path.expanduser("~"), 'Desktop/')
test_path = os.path.join(root_path, 'Test Dir')
# setup excelwriter
# Input writer
xlWriterOutput = pd.ExcelWriter(os.path.join(test_path,'read_directory_to_excel.xlsx'), engine='xlsxwriter')
files_list = []
dfFiles = pd.DataFrame
directory_path = os.path.join(root_path, test_path)
if not os.path.exists(directory_path):
    message = "Failed to find directory '%s'." % path
    if errors is not None:
        errors.append(message)
    else:
        raise IOError(message)
else:
    for path, dirs, files in os.walk(test_path):
        for file in files:
            files_list.append(os.path.join(path,file))
            dfFiles['path'] = path
            dfFiles['directory'] = dirs
            dfFiles['file_name'] = file
#Write the directory walk out to excel
dfFiles.to_excel(xlWriterOutput, header=True, sheet_name='Directory Output', index=False)
I started out with a list but then began moving my solution to Pandas and ExcelWriter. I get the error "TypeError: 'type' object does not support item assignment" on the line where I attempt to set dfFiles['path'] = path. I need some help at this point.
Upvotes: 0
Views: 718
Reputation: 14949
You can use the pathlib module:
from pathlib import Path
import pandas as pd

inp_path = Path('.')  # specify the path here
# one row per entry (files and directories) found recursively under inp_path
df = pd.DataFrame([{'parent': f.absolute().parent, 'full_path': f.absolute(), 'relative_path': f,
                    'file_name_without_extension': f.stem, 'file_name_with_extension': f.name}
                   for f in inp_path.glob('**/*')])
df.to_excel('specify the excel sheet path here.xlsx', index=False)
Here:
parent will give the parent directory info.
absolute will give the absolute path.
stem will give the file name without the extension.
name will give the name of the file.
NOTE: If you want only file information, you can add an if condition in the list comprehension: if f.is_file().
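The file-only filter from the note can be sketched roughly as follows. The function names collect_file_rows and rows_to_excel are made up for illustration, the paths are converted to strings, and relative_path is computed relative to the root you pass in (an assumption; the snippet above keeps it relative to the current directory).

```python
from pathlib import Path


def collect_file_rows(root):
    """Return one dict per regular file found under `root` (directories are skipped)."""
    root = Path(root)
    rows = []
    for f in sorted(root.glob('**/*')):
        if not f.is_file():  # the filter from the note: keep files only
            continue
        rows.append({
            'parent': str(f.absolute().parent),
            'full_path': str(f.absolute()),
            'relative_path': str(f.relative_to(root)),  # assumption: relative to root
            'file_name_without_extension': f.stem,
            'file_name_with_extension': f.name,
        })
    return rows


def rows_to_excel(rows, excel_path):
    """Build the DataFrame once from the collected rows and write it to Excel."""
    # Imported here so collect_file_rows needs only the standard library.
    import pandas as pd
    df = pd.DataFrame(rows)
    # Writing .xlsx needs an Excel engine (for example openpyxl) to be installed.
    df.to_excel(excel_path, index=False)
    return df
```

Collecting the rows first and creating the DataFrame in one step also sidesteps the TypeError from the question: dfFiles = pd.DataFrame (without parentheses) binds the DataFrame class itself rather than an instance, so dfFiles['path'] = path tries to assign an item on a type.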
Upvotes: 3