Reputation: 2350
I have the full paths of some files in a list, like this:
a = ['home/robert/Documents/Workspace/datafile.xlsx', 'home/robert/Documents/Workspace/datafile2.xls', 'home/robert/Documents/Workspace/datafile3.xlsx']
What I want is just the file names, without their extensions, like:
b = ['datafile', 'datafile2', 'datafile3']
What I have tried is:
import os
import re
xfn = re.compile(r'(\.xls)+')
b = []
for name in a:
    fp, fb = os.path.split(name)  # fb is the file name, e.g. 'datafile.xlsx'
    ofn = xfn.sub('', fb)
    b.append(ofn)
But it results in:
b = ['datafilex', 'datafile2', 'datafile3x']
Upvotes: 12
Views: 19566
Reputation: 1496
This is a repeat of: How to get the filename without the extension from a path in Python?
https://docs.python.org/3/library/os.path.html
In Python 3 you can also use pathlib ("The pathlib module offers high-level path objects."). The backslashes in the output below come from running it on Windows:
>>> from pathlib import Path
>>> p = Path("/a/b/c.txt")
>>> print(p.with_suffix(''))
\a\b\c
>>> print(p.stem)
c
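Applied to the list from the question, the same idea might look like the sketch below. The names `paths` and `stems` are only illustrative; the one pathlib feature used is `Path.stem`.

```python
from pathlib import Path

paths = [
    'home/robert/Documents/Workspace/datafile.xlsx',
    'home/robert/Documents/Workspace/datafile2.xls',
    'home/robert/Documents/Workspace/datafile3.xlsx',
]

# .stem is the final path component with its last suffix removed
stems = [Path(p).stem for p in paths]
print(stems)
```

Note that `.stem` removes only the last suffix, so a name such as `archive.tar.gz` would give `archive.tar`.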
Upvotes: 1
Reputation: 523184
The regex you've used is wrong. (\.xls)+ matches strings of the form .xls, .xls.xls, etc. This is why there is a remaining x in the .xlsx items. What you want is \.xls.*, i.e. a .xls followed by zero or more of any characters.
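As a rough sketch, here is the questioner's loop with the corrected pattern, applied to the base name of each path. The list `a` and the names `xfn` and `b` are taken from the question; the rest is just one way to wire it up.

```python
import os.path
import re

a = [
    'home/robert/Documents/Workspace/datafile.xlsx',
    'home/robert/Documents/Workspace/datafile2.xls',
    'home/robert/Documents/Workspace/datafile3.xlsx',
]

# \.xls.* matches ".xls" plus anything after it, so ".xlsx" is removed whole
xfn = re.compile(r'\.xls.*')

# Strip the directory first, then remove the extension from the file name
b = [xfn.sub('', os.path.basename(name)) for name in a]
print(b)
```

A tighter pattern such as `\.xlsx?$` would probably be safer, since `\.xls.*` removes everything from the first `.xls` onwards, wherever it appears in the name.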
You don't really need a regex here. There are specialized functions in os.path that deal with this: basename and splitext.
>>> import os.path
>>> os.path.basename('home/robert/Documents/Workspace/datafile.xlsx')
'datafile.xlsx'
>>> os.path.splitext(os.path.basename('home/robert/Documents/Workspace/datafile.xlsx'))[0]
'datafile'
So, assuming you don't really care about the .xls/.xlsx suffix, your code can be as simple as:
>>> a = ['home/robert/Documents/Workspace/datafile.xlsx', 'home/robert/Documents/Workspace/datafile2.xls', 'home/robert/Documents/Workspace/datafile3.xlsx']
>>> [os.path.splitext(os.path.basename(fn))[0] for fn in a]
['datafile', 'datafile2', 'datafile3']
(Also note the list comprehension, which replaces the explicit loop and append.)
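If you do care about the suffix, one possible variation is to check the extension that splitext returns before keeping the name. This is only a sketch: the `notes.txt` entry, the `EXCEL_EXTS` set and the `names` list are made up for illustration.

```python
import os.path

a = [
    'home/robert/Documents/Workspace/datafile.xlsx',
    'home/robert/Documents/Workspace/datafile2.xls',
    'home/robert/Documents/Workspace/datafile3.xlsx',
    'home/robert/Documents/Workspace/notes.txt',  # hypothetical non-Excel file
]

EXCEL_EXTS = {'.xls', '.xlsx'}  # assumed set of extensions to keep

names = []
for fn in a:
    # splitext returns (root, ext), where ext includes the leading dot
    root, ext = os.path.splitext(os.path.basename(fn))
    if ext.lower() in EXCEL_EXTS:
        names.append(root)
print(names)
```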
Upvotes: 27
Reputation: 32949
Why not just use the split method?
def get_filename(path):
    """ Gets a filename (without extension) from a provided path """
    filename = path.split('/')[-1].split('.')[0]
    return filename
>>> path = '/home/robert/Documents/Workspace/datafile.xlsx'
>>> filename = get_filename(path)
>>> filename
'datafile'
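One thing to keep in mind with this approach: `split('.')[0]` keeps only the part before the first dot, so it behaves differently from os.path.splitext when the name contains more than one dot. The comparison below is a sketch; the `report.v2.xlsx` name and the `get_stem` helper are hypothetical.

```python
import os.path

def get_filename(path):
    """Everything in the last path component before the first dot."""
    return path.split('/')[-1].split('.')[0]

def get_stem(path):
    """Last path component with only its final extension removed."""
    return os.path.splitext(os.path.basename(path))[0]

path = 'home/robert/Documents/Workspace/report.v2.xlsx'  # hypothetical name
print(get_filename(path))  # report
print(get_stem(path))      # report.v2
```

For names with a single dot, such as those in the question, both functions give the same result.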
Upvotes: 0
Reputation: 77251
One-liner:
>>> filename = 'file.ext'
>>> '.'.join(filename.split('.')[:-1]) if '.' in filename else filename
'file'
Upvotes: 4