Reputation: 101
I am given a variable that contains a Windows file path, and I have to read that file. The problem is that the path contains escape characters, and I can't seem to get rid of them. I checked os.path and pathlib, but both expect the text to be correctly formatted already, which I can't seem to construct.
For example, see below. Please note that fPath is given, so I can't prefix it with r to make it a raw string.
# this is given, I can't turn it into a raw string with r
fPath = "P:\python\t\temp.txt"
file = open(fPath, "r")
for line in file:
    print(line)
How can I turn fPath via some function or method from:
"P:\python\t\temp.txt"
to
"P:/python/t/temp.txt"
I've also tried .replace("\\","/"), which doesn't work.
I'm using Python 3.7 for this.
Upvotes: 6
Views: 35195
Reputation: 99
The topic is not as easy as it looks at first glance. The important questions are "where does the input string come from?" and "where and how will it be used?".
Let's try to understand the "real" value of fPath from the question:
fPath = "P:\python\t\temp.txt"
The 1st "\" is followed by the "p" symbol. There is no escape sequence "\p", so Python keeps it "as is" (i.e. we see the string exactly as Python interprets it). In memory this is a single backslash character; its repr shows it escaped as '\\'.
The 2nd and 3rd "\" are each followed by a "t" symbol. That combination is the escape sequence for a tab character. If you print fPath, you will see 2 tabs between "P:\python" and "emp.txt", i.e.:
P:\python emp.txt
This is how Python interprets the given string value.
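To make this concrete, here is a small sketch that inspects the value. It assumes the same string as in the question, but writes it with explicit escapes ("\\" for the backslash) so that newer Python versions do not warn about the invalid "\p" sequence; the resulting value is identical.

```python
# Same value as the literal "P:\python\t\temp.txt" from the question,
# written with an explicit "\\" to avoid the invalid-escape warning for "\p".
fPath = "P:\\python\t\temp.txt"

print(repr(fPath))        # 'P:\\python\t\temp.txt'
print(fPath.count("\\"))  # 1: only the backslash before "python" survives
print(fPath.count("\t"))  # 2: both "\t" sequences became tab characters
```

The repr shows one (escaped) backslash and two tab characters, which is why a plain replace of "\\" by "/" cannot recover the original path.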
Coming back to the questions named above: if the string value comes from another Python script, is it possibly expected to have tabs in those places? If so, they can't be replaced with "/t", because that would modify the original path. But if the value originally comes from a user who just copied and pasted a Windows-specific representation of the path, then the tabs should be replaced inside your script to match the user's expectation.
On Linux, tabs are allowed to be part of a path, while on Windows they are not. This relates to the second important question named above. If you are getting files from Linux that contain tabs in the file name, you can't use them on Windows. In that case you should either reject the affected files or (depending on your requirements) replace the tab symbols with, for example, 4 spaces and process the files under their changed names.
The complete solution would depend on answers to those questions!
You might need to check:
Assume your task relates only to Windows: you are getting a Windows-specific path from somewhere in its native representation (i.e. the "\t" is not meant to be an escape sequence), and you need to convert it to a string that Python can use as a file path.
The simplest solution would be to ask the user to double the "\" symbols (i.e. to use "\\" instead). However, how can we trust that the user follows the instruction? ;-)
By the way, how does the user provide the input in your case? I've tried 2 direct ways with Python 3.8.7 under Windows:
In both cases an input string like "C:\my\test\here" was passed to the script as "C:\\my\\test\\here"; the backslashes arrived intact and are only displayed doubled in the string's representation. If you are reading the input strings from a file, it will probably be easier to read it in binary mode, replace the "\" symbols as required, and only then convert the result to strings.
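A minimal sketch of that observation, assuming the path arrives as data at run time rather than as a literal in source code. The io.StringIO object here is only a stand-in for a real file or user input, and the variable names are illustrative.

```python
import io

# Stand-in for a text file whose single line is: C:\my\test\here
fake_file = io.StringIO("C:\\my\\test\\here\n")

line = fake_file.readline().rstrip("\n")
print(line)        # C:\my\test\here
print(repr(line))  # 'C:\\my\\test\\here' (backslashes shown escaped)

# No escape sequences were interpreted, so a plain replace is enough.
converted = line.replace("\\", "/")
print(converted)   # C:/my/test/here
```

Escape sequences are only processed in string literals in source code, so text read at run time keeps its backslashes.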
If you are still looking for a "quick direct solution", the approach proposed by Mainak Deb could be a step in the right direction. As the next step I would use os.path.normpath(), and finally you could replace the '\\' by '/'.
So the entire construct, using the function resolve_path() proposed by Mainak Deb, would be:
nPath = os.path.normpath(resolve_path(fPath)).replace('\\','/')
I'm not sure whether the function from Mainak Deb should also replace the single-quote escape sequence (a single quote is a valid symbol in a Windows path, so Python's interpretation of "\'" as a single symbol might not be correct for this task). Additionally, I would remove the "\000" from the function, because this construct means a character given by its octal value, so either all possible values should be matched or it doesn't make sense to match only one. I would also move the replacement of '\\' by '/' to the front, because (looking at the function independently of the path task) the sequence '\\t' is expected to be converted to '/t' and not to '\/t', which is not the same.
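The suggestions above could be sketched roughly as follows. This is not Mainak Deb's original function; the name resolve_path_revised and the dictionary are assumptions made for illustration. It replaces genuine backslashes first and drops the single "\000" case. The os.path.normpath() step is left out here so the result is the same on every platform.

```python
def resolve_path_revised(path):
    """Hypothetical variant of resolve_path(): undo common escape sequences."""
    # Replace the backslashes that survived as real backslashes first.
    path = path.replace("\\", "/")
    # Map each control character back to "/" plus the letter it was written with.
    control_chars = {
        "\t": "/t", "\n": "/n", "\r": "/r", "\f": "/f",
        "\v": "/v", "\a": "/a", "\b": "/b",
    }
    for char, replacement in control_chars.items():
        path = path.replace(char, replacement)
    return path


# Same value as the literal "P:\python\t\temp.txt" from the question.
print(resolve_path_revised("P:\\python\t\temp.txt"))  # P:/python/t/temp.txt
```

Like the original, this only covers the single-letter escapes; sequences such as "\x41", octal values, or "\'" lose information when the literal is parsed and cannot be recovered this way.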
Upvotes: 0
Reputation: 11
Use the function below; it will handle most cases.
def resolve_path(path):
    # control characters produced by escape sequences, and what to put back
    parent_replace = ['\t', '\n', '\r', '\f', '\v', '\a', '\b', '\000', '\\']
    child_replace = ['/t', '/n', '/r', '/f', '/v', '/a', '/b', '/000', '/']
    for i in range(len(parent_replace)):
        path = path.replace(parent_replace[i], child_replace[i])
    return path
Upvotes: 1
Reputation: 11
I came across a similar problem with Windows file paths. This is what works for me:
# read the path as typed by the user and split it on backslashes
file = input().split('\\')
file = '/'.join(file)
This turned the input from this:
"D:\test.txt"
to this:
"D:/test.txt"
Basically, when working with a Windows path, Python tends to show '\' as '\\'. That goes for every backslash. When working with file paths you won't have genuine double backslashes, since backslashes only separate folder names. This way you can list all folders in order by splitting on '\\' and then rejoin them with a forward slash using the .join function.
Hopefully this helps!
Upvotes: 1
Reputation: 382
When using Python version >= 3.4, the class Path from module pathlib offers a method called as_posix, which will sort of convert a path to a *nix-style path. For example, if you were to build a Path object via p = pathlib.Path('C:\\Windows\\SysWOW64\\regedit.exe'), asking it for p.as_posix() would yield C:/Windows/SysWOW64/regedit.exe. So to obtain a complete *nix-style path, you'd need to convert the drive letter manually.
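A short sketch of the same idea, using pathlib.PureWindowsPath instead of Path (an assumption made here so the example behaves the same on any operating system; on Windows, Path gives the same result). The path is written as a raw string, so it only applies when the backslashes are still intact.

```python
from pathlib import PureWindowsPath

# Raw string: the backslashes are kept exactly as written.
p = PureWindowsPath(r"C:\Windows\SysWOW64\regedit.exe")

posix_style = p.as_posix()
print(posix_style)  # C:/Windows/SysWOW64/regedit.exe
print(p.drive)      # C:  (the drive letter is kept, not converted)
```

As the answer notes, the drive letter remains in the result, so mapping it to a *nix mount point would still be a manual step.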
Upvotes: 1
Reputation: 2981
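You can use os.path.abspath() to normalize the path, provided the backslashes reach Python intact (for example as a raw string); it cannot undo escape sequences that have already been interpreted. On Windows the result uses backslashes, so replace them afterwards:
print(os.path.abspath(r"P:\python\t\temp.txt").replace("\\", "/"))
# on Windows this prints: P:/python/t/temp.txt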
See the documentation of os.path here.
Upvotes: 9
Reputation: 101
I've solved it.
The issue lies with the Python interpreter. \t and all the other escape sequences don't exist as such in the data; they are stored as the non-printing characters they represent.
So I got a bit lucky and someone else already faced the same problem and solved it with a hard brute-force method:
http://code.activestate.com/recipes/65211/
I just had to find it.
After that I have a raw string without escaped characters, and just need to run the simple replace() on it to get a workable path.
Upvotes: 4
Reputation: 2576
You can use the Path class from the pathlib library.
from pathlib import Path
docs_folder = Path("some_folder/some_folder/")
text_file = docs_folder / "some_file.txt"
f = open(text_file)
Upvotes: 3