D Chase

Reputation: 129

Python/Pandas: walk through a directory and save all folder names, subfolders, and files to Excel

I want to save all directory info (path, folder, subfolder, and files) to an Excel spreadsheet using Pandas.

Here is my code so far:

import os
import pandas as pd


# setup the paths
root_path = os.path.join(os.path.expanduser("~"), 'Desktop/')
test_path = os.path.join(root_path, 'Test Dir')

# setup excelwriter
# Input writer
xlWriterOutput = pd.ExcelWriter(os.path.join(test_path,'read_directory_to_excel.xlsx'), engine='xlsxwriter')


files_list = []
dfFiles = pd.DataFrame

directory_path = os.path.join(root_path, test_path)

if not os.path.exists(directory_path):
    message = "Failed to find directory '%s'." % path
    if errors is not None:
        errors.append(message)
    else:
        raise IOError(message)
else:
    for path, dirs, files in os.walk(test_path):
        for file in files:
            files_list.append(os.path.join(path,file))
            dfFiles['path'] = path
            dfFiles['directory'] = dirs
            dfFiles['file_name'] = file

#Write the directory walk out to excel
dfFiles.to_excel(xlWriterOutput, header=True, sheet_name='Directory Output', index=False)

I started out with a list, then began moving my solution to Pandas and ExcelWriter. I get the error "TypeError: 'type' object does not support item assignment" on the line where I attempt to set dfFiles['path'] = path. I need some help at this point.

Upvotes: 0

Views: 718

Answers (1)

Nk03

Reputation: 14949

You can use the pathlib module:

import pandas as pd
from pathlib import Path

inp_path = Path('.')  # specify the path here
# Paths are converted to str so they are written to the sheet as plain text
df = pd.DataFrame([{'parent': str(f.absolute().parent), 'full_path': str(f.absolute()), 'relative_path': str(f),
               'file_name_without_extension': f.stem, 'file_name_with_extension': f.name} for f in inp_path.glob('**/*')])

df.to_excel('specify the excel sheet path here.xlsx', index=False)

Here:

  1. parent gives the parent directory of the path.
  2. absolute() gives the absolute path.
  3. stem gives the file name without its extension.
  4. name gives the file name including its extension.
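
To make these attributes concrete, here is a small sketch on a made-up path. The path reports/2021/summary.tar.gz is hypothetical and does not need to exist; PurePosixPath is used so the results are the same on any OS. absolute() needs a concrete Path, so it is left out here.

```python
from pathlib import PurePosixPath

# Hypothetical path, used only for illustration; it does not need to exist.
f = PurePosixPath("reports/2021/summary.tar.gz")

print(f.parent)  # reports/2021
print(f.name)    # summary.tar.gz
print(f.stem)    # summary.tar  (only the last suffix is removed)
print(f.suffix)  # .gz
```

Note that stem strips only the final suffix, so files with compound extensions such as .tar.gz keep part of the extension in the "without extension" column.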

NOTE: If you only want file information, add a condition to the list comprehension: if f.is_file().
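
A minimal sketch of that filter, written as a plain loop so the condition is easy to see. The function name collect_file_rows is an assumption for illustration, and relative_path is computed relative to the given root here, which is a choice of this sketch rather than something the code above does.

```python
from pathlib import Path


def collect_file_rows(root):
    """Return one dict per regular file found under root (directories are skipped)."""
    root = Path(root)
    rows = []
    for f in sorted(root.glob("**/*")):
        if not f.is_file():
            continue  # skip directories and anything that is not a regular file
        rows.append({
            "parent": str(f.absolute().parent),
            "full_path": str(f.absolute()),
            # Assumption of this sketch: path shown relative to the root argument
            "relative_path": str(f.relative_to(root)),
            "file_name_without_extension": f.stem,
            "file_name_with_extension": f.name,
        })
    return rows
```

The returned list can be passed straight to pd.DataFrame(rows) and then written with to_excel as above. Building the rows first and creating the DataFrame once also avoids assigning columns inside the loop.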

Upvotes: 3
