Reputation: 89
I am pretty weak at regex. How can I extract the .sav
file name from the following string?
C:\Users...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\AutumnHi-20180531-183047-34-SystemNormal\AutumnHi-20180531-183047-34-SystemNormal.sav
Currently I am using this code:
re.findall(r'\\(.+).sav',txt)
but it returns far more than the file name:
['Users\\...\\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\\AutumnHi-20180531-183047-34-SystemNormal\AutumnHi-20180531-183047-34-SystemNormal.sav was']
I'm trying to find "AutumnHi-20180531-183047-34-SystemNormal.sav"
I am using Python 3.7.
Upvotes: 1
Views: 183
Reputation: 27723
Either of these expressions should extract the file name:
[^\\]+\.sav
([^\\]+\.sav)
Both match one or more characters that are not a backslash, followed by a literal .sav; the second wraps the match in a capturing group.
import re
print(re.findall(r"([^\\]+\.sav)", "C:\\Users...\\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\\AutumnHi-20180531-183047-34-SystemNormal\\AutumnHi-20180531-183047-34-SystemNormal.sav"))
['AutumnHi-20180531-183047-34-SystemNormal.sav']
Upvotes: 0
Reputation: 819
I am assuming your goal is to parse file names rather than to learn regex.
In that case I would use the pathlib module instead.
C:\Users\barry>py -3.7
Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pathlib
>>> filename = r'C:\Users\...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\WinterLo-20180729-043047-34-SystemNormal\WinterLo-20180729-043047-34-SystemNormal.sav'
>>> path = pathlib.Path(filename)
>>> path.name
'WinterLo-20180729-043047-34-SystemNormal.sav'
>>> path.parent
WindowsPath('C:/Users/.../Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20/WinterLo-20180729-043047-34-SystemNormal')
>>>
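The session above runs on Windows, where pathlib.Path already understands backslash separators. As a minimal sketch for code that might also run on Linux or macOS, pathlib.PureWindowsPath parses a Windows-style string on any platform. The path is the sample one from this answer, with the elided "..." left as-is:

```python
from pathlib import PureWindowsPath

# Sample path from the answer above; the "..." is the elided part of the original path.
filename = r'C:\Users\...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\WinterLo-20180729-043047-34-SystemNormal\WinterLo-20180729-043047-34-SystemNormal.sav'

# PureWindowsPath applies Windows path rules regardless of the host OS.
path = PureWindowsPath(filename)

print(path.name)    # file name with extension
print(path.stem)    # file name without the extension
print(path.suffix)  # the extension, including the leading dot
```

Since PureWindowsPath never touches the file system, it only helps with splitting the string, not with opening the file.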
Upvotes: 0
Reputation: 163352
You could match a backslash, then capture one or more non-backslash characters in a group using a negated character class, followed by a literal dot and sav.
A negative lookahead, (?!\S), then asserts that what is directly to the right is not a non-whitespace character, so the match must be followed by whitespace or the end of the string.
\\([^\\]+\.sav)(?!\S)
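A minimal sketch of how this pattern might be used with re.findall. The first input is the path from the question with some hypothetical trailing text appended; the second is a made-up path showing that the lookahead rejects an extension that merely starts with .sav:

```python
import re

# Backslash, then a captured run of non-backslash characters ending in ".sav",
# which must be followed by whitespace or the end of the string.
pattern = re.compile(r'\\([^\\]+\.sav)(?!\S)')

# Path from the question, followed by hypothetical trailing text.
txt = r'C:\Users...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\AutumnHi-20180531-183047-34-SystemNormal\AutumnHi-20180531-183047-34-SystemNormal.sav was saved'
names = pattern.findall(txt)
print(names)

# Hypothetical path whose extension only begins with ".sav".
rejected = pattern.findall(r'C:\tmp\notes.saved')
print(rejected)
```

Because the pattern contains a single capturing group, findall returns only the captured file name, without the leading backslash.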
Upvotes: 1
Reputation: 945
The following pattern should match the filename:
(?=[^\\]*$).*\.sav
The above pattern uses a positive lookahead, (?=...), to assert that no character from the current position up to the end of the string is a backslash. So in effect the match can only start after the last backslash, and .*\.sav then matches the desired text. For other details, see "EXPLANATION" on the right side of the linked regex101 demo.
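As a quick check, here is a sketch applying the pattern with re.search to the path from the question (the elided "..." is kept as-is):

```python
import re

# Path from the question.
txt = r'C:\Users...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\AutumnHi-20180531-183047-34-SystemNormal\AutumnHi-20180531-183047-34-SystemNormal.sav'

# The lookahead fails at every position that still has a backslash ahead of it,
# so the first successful match starts right after the last backslash.
match = re.search(r'(?=[^\\]*$).*\.sav', txt)
if match:
    print(match.group(0))
```

re.search is used here rather than re.findall because only a single match is expected.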
Upvotes: 0
Reputation: 195438
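import re
# Raw string, so each backslash is written once.
txt = r'''C:\Users\...\Standard Loadflows Seq and Dyn PSSEv34 - 2019-02-20\WinterLo-20180729-043047-34-SystemNormal\WinterLo-20180729-043047-34-SystemNormal.sav'''
# Lookbehind for a backslash, then non-backslash characters ending in a literal ".sav".
print(re.findall(r'(?<=\\)[^\\]+\.sav', txt)[0])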
Prints:
WinterLo-20180729-043047-34-SystemNormal.sav
You could achieve the same without the re module:
print(txt.split('\\')[-1])
Upvotes: 0