pHorseSpec

Reputation: 1274

Extract Directory Names from Dynamic File Paths in Python

Is there a way in Python to extract each directory and the file name from a Windows file path, either with a regex and group() or with os.path?

I'm dealing with file paths that have a varying number of directories: one line could be D:\dir1\file.txt while the next could be Z:\dir1\dir2\dir3\dir4\dir5\file.txt. Is there a way to do this with a regex or a built-in function in Python when the number of \ separators in the text I'm searching varies?

Any insight would be helpful, even if it's just the bitter truth that it can't be done.

After Edit:

I'm trying to extract the directory names between the \ separators, plus the final file.txt, and write each directory or file to its own column in an output text file.

My desired output for the above two lines would be:

 col1|col2|col3|col4|col5|col6
 dir1|dir2|dir3|dir4|dir5|file.txt
 dir1|    |    |    |    |file.txt

I know os.path has a lot of good built-in functions, but after reading https://docs.python.org/2/library/os.path.html, I don't think any of them does what I'm trying to do.

Upvotes: 0

Views: 1161

Answers (1)

Jacques de Hooge

Reputation: 6990

You can separate fileName and directory by using:

splitFilePath = filePath.rsplit ('\\', 1)
directory = splitFilePath [0]
fileName = splitFilePath [1]

You can get all chunks separated by '\' by using:

chunks = filePath.split ('\\')

You can then take out particular chunks by using slicing and glue subsets of them together using join.
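As a rough sketch of the slicing and joining mentioned above (the example path and the variable names are made up for illustration, not taken from the question):

```python
# Hypothetical example path, chosen only for illustration.
file_path = r'Z:\dir1\dir2\dir3\file.txt'

# '\\' is a single backslash written as an escaped string literal.
chunks = file_path.split('\\')    # ['Z:', 'dir1', 'dir2', 'dir3', 'file.txt']

drive = chunks[0]                 # first chunk is the drive
directories = chunks[1:-1]        # everything between drive and file name
file_name = chunks[-1]            # last chunk is the file name

# Glue a subset of the chunks back together with join.
parent = '\\'.join(chunks[:-1])   # the path without the file name
```

This assumes every path starts with a drive letter and uses only backslashes as separators.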

Producing the columns you added in your edited question requires knowing the longest path, since that determines the number of columns:

  • Split each path using the split function as explained above
  • Find the length of the longest resulting list
  • In every shorter list, insert empty strings before the last element until all lists have the same length
  • Join each list with '|' using the join function
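The steps above could look roughly like this. It is only a sketch: the function name to_rows is invented here, the drive letter is dropped because the desired output in the question does not show it, and the empty columns are left empty rather than filled with spaces.

```python
def to_rows(paths, sep='|'):
    """Split Windows paths into equal-width, sep-joined rows (hypothetical helper)."""
    # Split each path on the backslash and drop the leading drive chunk.
    split_paths = [path.split('\\')[1:] for path in paths]

    # The longest list determines the number of columns.
    width = max(len(parts) for parts in split_paths)

    rows = []
    for parts in split_paths:
        # Insert empty strings before the last element (the file name).
        padding = [''] * (width - len(parts))
        padded = parts[:-1] + padding + parts[-1:]
        rows.append(sep.join(padded))
    return rows


# The two example paths from the question.
paths = [
    r'Z:\dir1\dir2\dir3\dir4\dir5\file.txt',
    r'D:\dir1\file.txt',
]
rows = to_rows(paths)
for row in rows:
    print(row)
```

Note that max raises a ValueError on an empty list, so this assumes at least one path is supplied.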

In response to your comment:

Running the following program

filePath = r'E:\dir1\Logs\dir2\1998-12-23\message.txt'
splitFilePath = filePath.rsplit ('\\', 1)
directory = splitFilePath [0]
fileName = splitFilePath [1]
print (directory)
print (fileName)

gives as output

E:\dir1\Logs\dir2\1998-12-23
message.txt

So use '\\' rather than '\' in the rsplit call: inside an ordinary string literal the backslash has to be escaped.

Upvotes: 2
