Reputation: 2636
I have a filesystem layout like the one below, where a symlink with no spaces in its name points to a directory whose name contains spaces (an attempt to circumvent issues such as this one):
$ ls -lR
drwxr-xr-x 3 user group 96 Nov 25 15:37 with - spaces/
lrwxr-xr-x 1 user group 18 Nov 25 15:28 with_no_spaces@ -> /tmp/with - spaces
./with - spaces:
total 0
drwxr-xr-x 5 user group 160 Nov 25 15:37 dir1/
./with - spaces/dir1:
total 0
-rw-r--r-- 1 user group 0 Nov 25 15:37 a
-rw-r--r-- 1 user group 0 Nov 25 15:37 b
-rw-r--r-- 1 user group 0 Nov 25 15:37 c
I would like to use the pathlib module to iterate over the a, b, c files in dir1 and get the absolute path of each.
$ pwd
/private/tmp/with - spaces/dir1
import pathlib
p = pathlib.Path(".")
file_names = [_file.absolute().as_posix() for _file in p.iterdir()]
print(file_names)
#['/private/tmp/with - spaces/dir1/a',
# '/private/tmp/with - spaces/dir1/c',
# '/private/tmp/with - spaces/dir1/b']
This works as expected, but the file paths are then consumed by a downstream system (Spark), where the spaces in the paths cause issues. Is there a platform-agnostic way to handle this with pathlib, with escape characters, or with something else?
I'm using Python 3.7.4.
Thanks.
Upvotes: 1
Views: 7787
Reputation: 11922
I can't really pin it down, since your code is working as it should and you didn't detail how the downstream system consumes the paths. However, on most platforms, path arguments containing spaces are usually fine as long as they are wrapped in double quotes.
So try this (assuming Python 3.6+):
file_names = [f'"{_file.absolute().as_posix()}"' for _file in p.iterdir()]
Or this, which works on all Python versions:
file_names = ['"' + _file.absolute().as_posix() + '"' for _file in p.iterdir()]
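If quoting turns out not to be enough, two other options may be worth trying. This is only a sketch, and whether Spark accepts either form is an assumption you would need to verify. The first is to start from the symlink path itself instead of ".", because absolute() does not follow symlinks, so the space-free name is preserved. The second is as_uri(), which percent-encodes the spaces. The helper names list_via_link and demo are made up for illustration, and the layout is rebuilt in a temporary directory so the example is self-contained (creating symlinks on Windows may require extra privileges).

```python
import os
import pathlib
import tempfile


def list_via_link(link_dir):
    """Return absolute POSIX paths of the entries in link_dir, keeping the symlink name."""
    link_dir = pathlib.Path(link_dir)
    # absolute() only prepends the cwd when the path is relative;
    # unlike resolve(), it does not replace symlinks with their targets.
    return sorted(p.absolute().as_posix() for p in link_dir.iterdir())


def demo():
    # Rebuild a throwaway copy of the layout from the question.
    root = pathlib.Path(tempfile.mkdtemp())
    real = root / "with - spaces" / "dir1"
    real.mkdir(parents=True)
    for name in ("a", "b", "c"):
        (real / name).touch()

    link = root / "with_no_spaces"
    os.symlink(root / "with - spaces", link, target_is_directory=True)

    # Option 1: iterate through the symlink so the paths never contain the spaced name.
    via_link = list_via_link(link / "dir1")
    # Option 2: file:// URIs, where each space becomes %20.
    uris = [p.as_uri() for p in sorted(real.iterdir())]
    return via_link, uris


if __name__ == "__main__":
    via_link, uris = demo()
    print(via_link)
    print(uris)
```

Note that Path(".") cannot give you the symlinked form: the current working directory reported by the OS is already the physical path, which is why your output shows /private/tmp/with - spaces. You have to name the symlink explicitly, as in option 1.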
Upvotes: 3