Reputation: 1895
I wan to split file name from the given file using regex function re.split.Please find the details below:
SVC_DC = 'JHN097567898_01102019_050514_svc_dc.tar"
My solution:
import regex as re
ans=re.split(os.sep,SVC_DC)
Error: re.error: bad escape (end of pattern) at position 0
Thanks in advance
Upvotes: 0
Views: 1652
Reputation: 30971
The reason of your failure are details concerning regular expressions, namely the quotation issue.
E.g. under Windows os.sep
= '\\'
, i.e. a single backslash.
But the backslash in regex has special meaning, just to escape special characters, so in order to use it literally, you have to write it twice.
Try the following code:
import re
import os
SVC_DC = 'JHN097567898_01102019_050514_svc_dc.tar'
print(re.split(os.sep * 2, SVC_DC))
The result is:
['JHN097567898_01102019_050514_svc_dc.tar']
As the source string does not contain any backslashes, the result is a list containing only one item (the whole source string).
To make the regex working under both Windows and Unix, you can try:
print(re.split('\\' + os.sep, SVC_DC))
Note that this regex contains:
Note that the forward slash (in Unix case) does not require quotation, but using quotation here is still acceptable (not needed, but working).
Upvotes: 1
Reputation: 110271
If you want a filename, regexes are not your answer.
Python has the pathlib module dedicated to handling filepaths, and its objects, besides having methods to get the isolated filename handlign all possible corner-cases, also have methods to open, list files, and do everything one normally does to a file.
To get the base filename from a path, just use its automatic properties:
In [1]: import pathlib
In [2]: name = pathlib.Path("/home/user/JHN097567898_01102019_050514_svc_dc.tar")
In [3]: name.name
Out[3]: 'JHN097567898_01102019_050514_svc_dc.tar'
In [4]: name.parent
Out[4]: PosixPath('/home/user')
Otherwise, even if you would not use pathlib
, os.path.sep being a single character, there would be no advantage in using re.split
at all - normal string.split would do. Actually, there is os.path.split
as well, that, predating pathlib, would always do the samething:
In [6]: name = "/home/user/JHN097567898_01102019_050514_svc_dc.tar"
In [7]: import os
In [8]: os.path.split(name)[-1]
Out[8]: 'JHN097567898_01102019_050514_svc_dc.tar'
And last (and in this case, actually least), the reason of the error is that you are on windows, and your os.path.sep
character is "\"
- this character alone is not a full regular expression, as the regex engine expects a character indicating a special sequence to come after the "\". For it to be used withour error, you'd need to do:
re.split(re.escape(os.path.sep), "myfilepath")
Upvotes: 2