RedBoxes

Reputation: 101

Python Convert Windows File path in a variable

I am given a variable that contains a Windows file path, and I have to read that file. The problem is that the path contains escape characters, and I can't seem to get rid of them. I checked os.path and pathlib, but both expect the text to be formatted correctly already, which I can't seem to construct.

For example, consider the code below. Please note that fPath is given, so I can't prefix it with r to make it a raw string.

# this is given; I can't make it a raw string with r
fPath = "P:\python\t\temp.txt"

file = open(fPath, "r")
for line in file:
    print(line)

How can I turn fPath via some function or method from:

"P:\python\t\temp.txt"

to

"P:/python/t/temp.txt"

I've also tried .replace("\","/"), which doesn't work.

I'm using Python 3.7 for this.

Upvotes: 6

Views: 35195

Answers (8)

Dr.CKYHC

Reputation: 99

This topic is not as easy as it looks at first glance. The important questions are "where does the input string come from?" and "where and how will it be used?".

Let's try to understand the "real" value of fPath from the question:

fPath = "P:\python\t\temp.txt"

The first "\" is followed by "p". There is no escape sequence "\p", so Python keeps both characters as they are: the string contains a literal backslash followed by "p". In the string's repr, that backslash is displayed escaped, as '\\'.

The second and third "\" are each followed by "t". "\t" is a valid escape sequence meaning a tab character. If you print fPath, you will see two tabs between "P:\python" and "emp.txt", i.e.:

P:\python       emp.txt

This is how Python interprets the given string value.
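
A minimal sketch to check this interpretation yourself. The literal is written with a doubled first backslash, which yields the same value as the one in the question while avoiding the invalid-escape warning that newer Python versions emit for "\p".

```python
# Same value as "P:\python\t\temp.txt" from the question; the first
# backslash is doubled only to avoid an invalid-escape warning for "\p".
fPath = "P:\\python\t\temp.txt"

# repr() shows what the string really contains: one literal backslash
# (displayed as \\) and two tab characters (displayed as \t).
print(repr(fPath))

# Count the characters of interest.
print(fPath.count("\\"))  # literal backslashes
print(fPath.count("\t"))  # tab characters
```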

Coming back to the questions above: if the string value comes from another Python script, is it possibly expected to have tabs in those places? If so, they can't be replaced with "/t", because that would modify the original path. But if the value originally comes from a user who just copied and pasted a Windows-style path, the tabs should be replaced inside your script to match the user's expectation.

On Linux, tabs are allowed in a path, while on Windows they are not. This relates to the second question above. If you receive files from Linux with tabs in their names, you can't use those names on Windows. In that case you should either refuse to process the affected files or (depending on your requirements) replace each tab with, for example, four spaces and rename the files accordingly.

The complete solution depends on the answers to those questions!

You might need to check:

  • Does the given path relate to Windows?
  • Is the script running on Windows?
  • Etc.

Assume your task relates only to Windows: you get a Windows-specific path from somewhere in its native representation (i.e. "\t" is not meant to be an escape sequence), and you need to convert it to a string that Python can use as a file path.

The simplest solution would be to ask the user to double the "\" characters (i.e. to use "\\" instead). But can we trust the user to follow that instruction? ;-)

By the way, how does the user provide the input in your case? I tried two direct ways with Python 3.8.7 on Windows:

  • Taking a command-line argument via sys.argv[1]
  • Getting user input via a call to input('Give me some text:')

In both cases an input string like "C:\my\test\here" reached the script as "C:\\my\\test\\here" (as displayed by repr), so the "\" characters were kept as literal backslashes and were not interpreted as escapes. If you are reading the input strings from a file, it is probably easier to read them in binary mode, replace the "\" characters as required, and only then decode the result to strings.
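
To illustrate why text arriving at run time behaves differently from a literal typed into source code, here is a small sketch. The raw string below only stands in for text typed by a user or passed on the command line (an assumption made to keep the example runnable); such text is never run through Python's escape processing.

```python
# Stand-in for text received from input() or sys.argv: the backslashes
# are literal characters, exactly as the user typed them.
user_text = r"C:\my\test\here"

# repr() doubles each backslash for display; the string itself still
# holds a single backslash per separator.
print(repr(user_text))

# Because the backslashes are literal, a plain replace is enough.
converted = user_text.replace("\\", "/")
print(converted)
```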

If you are still looking for a "quick direct solution", the approach proposed by Mainak Deb could be a step in the right direction. As a next step I would use os.path.normpath(); finally, you could replace '\\' with '/'.

So the entire construct, using the resolve_path() function proposed by Mainak Deb, would be:

nPath = os.path.normpath(resolve_path(fPath)).replace('\\','/')

I'm not sure whether the function from Mainak Deb should also replace the single-quote escape sequence (a single quote is a valid character in a Windows path, so Python's interpretation of "\'" as a single character might not be correct for this task). Additionally, I would remove "\000" from the function: this construct denotes a character by its octal value, so either all possible values should be matched or it makes no sense to match only one. I would also move the replacement of '\\' by '/' to the front, because (looking at the function independently of the paths task) the sequence '\\t' is expected to be converted to '/t' and not to '\/t', which is not the same.
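
The changes suggested above could look roughly like the sketch below. resolve_path_reordered is a hypothetical name, not part of any library or of the original answer; ntpath (the Windows implementation behind os.path) is used so that the example behaves the same on any platform.

```python
import ntpath  # Windows flavour of os.path, importable on every platform

# Control characters that a literal such as "\t" turns into, mapped back
# to the slash-plus-letter text the user most likely meant.
_CONTROL_TO_TEXT = {
    "\t": "/t", "\n": "/n", "\r": "/r", "\f": "/f",
    "\v": "/v", "\a": "/a", "\b": "/b",
}


def resolve_path_reordered(path):
    """Hypothetical variant of resolve_path(): backslashes first, no octal entry."""
    path = path.replace("\\", "/")
    for control_char, text in _CONTROL_TO_TEXT.items():
        path = path.replace(control_char, text)
    return path


# Same value as the literal in the question (first backslash doubled).
fPath = "P:\\python\t\temp.txt"
nPath = ntpath.normpath(resolve_path_reordered(fPath)).replace("\\", "/")
print(nPath)
```

This still cannot recover every case: sequences such as "\x41" or "\1" have already been turned into other characters by the time the function sees the string.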

Upvotes: 0

Mainak Deb

Reputation: 11

Use the function below; it handles most cases:

def resolve_path(path):
    # Control characters produced by escape sequences, and their text replacements.
    parent_replace = ['\t', '\n', '\r', '\f', '\v', '\a', '\b', '\000', '\\']
    child_replace = ['/t', '/n', '/r', '/f', '/v', '/a', '/b', '/000', '/']
    for old, new in zip(parent_replace, child_replace):
        path = path.replace(old, new)
    return path

Upvotes: 1

Isoide

Reputation: 11

I came across a similar problem with Windows file paths. This is what works for me:

    # Read the path as typed; input() keeps the backslashes literal.
    parts = input().split('\\')
    file = '/'.join(parts)

This converted the input from this:

    "D:\test.txt"

to this:

    "D:/test.txt"

Basically, when working with a Windows path, Python tends to show '\' as '\\'. This goes for every backslash. In file paths you won't have genuine double backslashes, since a backslash only separates folder names. So you can list all folders in order by splitting on '\\' and then rejoin them with a forward slash using .join().

Hopefully this helps!

Upvotes: 1

blurryroots

Reputation: 382

With Python >= 3.4, the Path class from the pathlib module offers a method called as_posix, which will more or less convert a path to a *nix-style path. For example, if you build a Path object via p = pathlib.Path('C:\\Windows\\SysWOW64\\regedit.exe') on Windows, p.as_posix() yields C:/Windows/SysWOW64/regedit.exe. To obtain a complete *nix-style path, you'd still need to convert the drive letter manually.
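
A short sketch of as_posix(). PureWindowsPath is used here instead of Path so that the example parses Windows separators on any operating system; that substitution is an assumption made for portability, not something this answer prescribes.

```python
from pathlib import PureWindowsPath

# PureWindowsPath applies Windows parsing rules without touching the
# file system, so it also works on Linux or macOS.
p = PureWindowsPath("C:\\Windows\\SysWOW64\\regedit.exe")

posix_style = p.as_posix()
print(posix_style)  # forward slashes, drive letter kept
print(p.drive)      # the drive part that would need manual handling
```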

Upvotes: 1

Nordle

Reputation: 2981

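You can use os.path.abspath() to normalize it. Note that this only helps if the string holds literal backslashes (a raw string is used here), and that on Windows the result uses backslashes, so replace them afterwards:

import os

print(os.path.abspath(r"P:\python\t\temp.txt").replace("\\", "/"))

>>> P:/python/t/temp.txt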

See the documentation of os.path here.

Upvotes: 9

RedBoxes

Reputation: 101

I've solved it.

The issue lies with the Python interpreter. \t and the other sequences don't exist as such in the data; they are representations of non-printable characters.

I got a bit lucky: someone else had already faced the same problem and solved it with a brute-force method:

http://code.activestate.com/recipes/65211/

I just had to find it.

After that I have a raw string without escaped characters, and just need to run the simple replace() on it to get a workable path.
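
The general idea can be sketched as follows. This is not the code from the linked recipe, only an illustration of the approach described here; to_raw and its table are hypothetical names, and the table covers just the single-letter escapes.

```python
# Map each control character back to the two-character text
# (backslash + letter) that produced it in the source literal.
_ESCAPES = {
    "\a": r"\a", "\b": r"\b", "\f": r"\f", "\n": r"\n",
    "\r": r"\r", "\t": r"\t", "\v": r"\v",
}


def to_raw(text):
    """Hypothetical helper: undo single-letter escape interpretation."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


fPath = "P:\\python\t\temp.txt"  # same value as the literal in the question
raw_path = to_raw(fPath)          # backslashes restored
usable_path = raw_path.replace("\\", "/")
print(raw_path)
print(usable_path)
```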

Upvotes: 4

Adriano Silva

Reputation: 2576

You can use the Path class from the pathlib library.

from pathlib import Path

docs_folder = Path("some_folder/some_folder/")
text_file = docs_folder / "some_file.txt"
f = open(text_file)

Upvotes: 3

tbalaz

Reputation: 159

If you would like to use replace, then do:

replace("\\","/")
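
For context, a small sketch of when this replace works. The version in the question, .replace("\","/"), is not valid Python because "\" escapes the closing quote; the backslash has to be doubled. Even then, replace can only act on backslashes that are still present in the string.

```python
# Literal backslashes (raw string): replace converts every separator.
literal = r"P:\python\t\temp.txt"
print(literal.replace("\\", "/"))

# Same text written as a normal literal, as in the question: "\t" has
# already become a tab, so only the first backslash is left to replace.
interpreted = "P:\\python\t\temp.txt"  # equals "P:\python\t\temp.txt"
print(repr(interpreted.replace("\\", "/")))
```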

Upvotes: 1
