Reputation: 432
I have a path in a variable like this:
path = "C:\HT_Projeler\7\Kaynak\wrapped_gedizw.tif"
This is incorrect because the literal contains escape sequences:
>>> path
'C:\\HT_Projeler\x07\\Kaynak\\wrapped_gedizw.tif'
How can I fix the path in this variable so it becomes equivalent to r"C:\HT_Projeler\7\Kaynak\wrapped_gedizw.tif"
or "C:/HT_Projeler/7/Kaynak/wrapped_gedizw.tif"
?
I know this topic is common, and I have looked through many questions here (1, 2, etc.).
Addendum:
Here is my exact script:
...
basinFile = self._gv.basinFile
basinDs = gdal.Open(basinFile, gdal.GA_ReadOnly)
basinNumberRows = basinDs.RasterYSize
basinNumberCols = basinDs.RasterXSize
...
Here, self._gv.basinFile
contains my path, so I cannot put an "r" prefix in front of self._gv.basinFile.
Upvotes: 0
Views: 108
Reputation: 189397
In the general case, there is no way to tell whether a character in a path is correct without checking it against the actual paths on your computer (and "special character" is not really well defined; how do you know that the path wasn't originally written with \x41
which got converted to A
anyway?)
As a weak heuristic, you could look for existing path names within a particular edit distance of the one you were given, for example:
import os
from difflib import SequenceMatcher as similarity  # or whatever

# variable holds the possibly mangled path
drive, rest = os.path.splitdrive(variable)
path = drive + os.sep
for p in rest.replace('\\', '/').strip('/').split('/'):
    npath = os.path.join(path, p)
    if not os.path.exists(npath):
        # rank the entries that really exist in the parent directory by similarity to p
        similar = sorted(((similarity(None, x, p).ratio(), x)
                          for x in os.listdir(path)), reverse=True)
        # recurse on most similar, second most similar, etc? or something
        break
    path = npath
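A different way to "check against the actual paths" is to enumerate the spellings that could have produced each control character and keep only the candidates that exist. The following is a rough sketch under stated assumptions, not a general solution: the table ESCAPE_SPELLINGS and the functions candidate_paths and repair are hypothetical names made up for this example, the table covers only a few control characters, and ordinary letters that may have been written as escapes are not considered at all.

```python
import itertools
import os

# Hypothetical, non-exhaustive table: a control character mapped to source
# spellings that would have produced it in a non-raw string literal.
ESCAPE_SPELLINGS = {
    "\a": ["\\a", "\\7", "\\07", "\\007", "\\x07"],
    "\b": ["\\b", "\\10", "\\010", "\\x08"],
    "\t": ["\\t", "\\11", "\\011", "\\x09"],
    "\n": ["\\n", "\\12", "\\012", "\\x0a"],
}


def candidate_paths(mangled):
    """Yield every string obtained by undoing the known escapes in `mangled`."""
    options = [ESCAPE_SPELLINGS.get(ch, [ch]) for ch in mangled]
    for combo in itertools.product(*options):
        yield "".join(combo)


def repair(mangled, exists=os.path.exists):
    """Return the candidates for which `exists` is true (possibly none or several)."""
    return [c for c in candidate_paths(mangled) if exists(c)]


# The mangled value from the question, written with explicit escapes.
mangled = "C:\\HT_Projeler\x07\\Kaynak\\wrapped_gedizw.tif"
intended = r"C:\HT_Projeler\7\Kaynak\wrapped_gedizw.tif"

# A stand-in for the file system, so the sketch runs anywhere.
found = repair(mangled, exists=lambda p: p == intended)
print(found == [intended])
```

In real use you would leave exists at its default so that os.path.exists decides. If two candidates both exist, the ambiguity remains, so this only narrows the guess.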
Upvotes: 1
Reputation: 126787
If you write paths directly in Python code, just use raw strings, as others have suggested.
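As a small illustration of that advice (the variable names are only for this sketch), the first three spellings below are safe, while the unprefixed literal from the question silently loses a character:

```python
# Three safe spellings of the same Windows path.
raw = r"C:\HT_Projeler\7\Kaynak\wrapped_gedizw.tif"          # raw string literal
doubled = "C:\\HT_Projeler\\7\\Kaynak\\wrapped_gedizw.tif"   # every backslash escaped
forward = "C:/HT_Projeler/7/Kaynak/wrapped_gedizw.tif"       # forward slashes

# The value the question ends up with, written with explicit escapes:
# the two characters "\7" were turned into the single character "\x07".
mangled = "C:\\HT_Projeler\x07\\Kaynak\\wrapped_gedizw.tif"

print(raw == doubled)                     # True
print(raw.replace("\\", "/") == forward)  # True
print(raw == mangled)                     # False
print(len(raw) - len(mangled))            # 1
```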
If instead that string is out of your control, there's not much you can do "after the fact". Escape-sequence conversion is not injective, so given a string in which the escape sequences have already been processed, you cannot "go back" unambiguously. In other words, if someone incorrectly writes:
path = "C:\HT_Projeler\7\Kaynak\wrapped_gedizw.tif"
as you show, you get
'C:\\HT_Projeler\x07\\Kaynak\\wrapped_gedizw.tif'
and there's no way to know for certain "what they meant", because that \x07
may have been written as \7
, or \x07
, or \a
. Heck, any letter may have been originally written as an escape sequence - what you see in that string as an a
may have actually been \x61
.
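To make the non-injectivity concrete, here is a short sketch (the variable names are only illustrative) showing several distinct source spellings that end up as exactly the same string:

```python
# Several source spellings of the single BEL character.
spellings_of_bel = ["\7", "\07", "\007", "\x07", "\a", "\u0007"]
print(all(s == "\x07" for s in spellings_of_bel))  # True

# An ordinary letter can hide an escape sequence too.
print("a" == "\x61" == "\141" == "\u0061")         # True

# So two different "intentions" are indistinguishable after the fact.
written_with_digit = "HT_Projeler\7"
written_with_letter = "HT_Projeler\a"
print(written_with_digit == written_with_letter)   # True
```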
Long story short: your caller is responsible for giving you correct data. Once it's corrupted, there's no way to recover the original.
Upvotes: 5