zheyuanWang

Reputation: 1374

os.path.basename(file) vs file.split("/")[-1]

I need to extract seq_00034 from a file path like

    file = "/home/user/workspace/data/seq_00034.pkl"

I know 2 ways to achieve it:

Method A

    import os
    seq_name = os.path.basename(file).split(".")[0]

or

Method B

    seq_name = file.split("/")[-1].split(".")[0]

Which is safer/faster?

(taking the cost of import os into account)

Is there a more elegant way to extract seq_name from a given path?

Upvotes: 1

Views: 275

Answers (2)

zheyuanWang

Reputation: 1374

It turns out that splitting twice (i.e. Method B) is faster than os.path.basename + split.

Both are considerably faster than pathlib, which took roughly ten times as long as the double split in my test.

Speed test:

    import os
    import pathlib
    import time

    given_path = "/home/home/user/workspace/data/task_2022_02_xx_xx_xx_xx.pkl"

    time1 = time.time()
    for _ in range(10000):
        seq_name = given_path.split("/")[-1].split(".")[0]
    print(time.time()-time1, 'time of split')


    time2 = time.time()
    for _ in range(10000):
        seq_name = pathlib.Path(given_path).stem
    print(time.time()-time2, 'time of pathlib')


    time3 = time.time()
    for _ in range(10000):
        seq_name = os.path.basename(given_path).split(".")[0]
    print(time.time()-time3, 'time of os.path')

Result on my PC (seconds per 10000 calls):

    0.00339508056640625 time of split
    0.0355381965637207 time of pathlib
    0.005405426025390625 time of os.path
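Single time.time() measurements like these can be noisy, so here is a minimal sketch of the same comparison using the standard timeit module. The helper names (by_split, by_os_path, by_pathlib) are my own, chosen for illustration; the numbers will vary by machine, so no output is shown.

```python
import os
import pathlib
import timeit

# Same sample path as in the test above.
given_path = "/home/home/user/workspace/data/task_2022_02_xx_xx_xx_xx.pkl"


def by_split(path):
    # Method B: plain string splitting.
    return path.split("/")[-1].split(".")[0]


def by_os_path(path):
    # Method A: os.path.basename, then split on the first dot.
    return os.path.basename(path).split(".")[0]


def by_pathlib(path):
    # pathlib variant: build a Path object and read its stem.
    return pathlib.Path(path).stem


candidates = {"split": by_split, "os.path": by_os_path, "pathlib": by_pathlib}

timings = {}
for label, func in candidates.items():
    # repeat() returns one total per run; the minimum is the least noisy estimate.
    runs = timeit.repeat(lambda: func(given_path), number=10000, repeat=5)
    timings[label] = min(runs)
    print(f"{timings[label]:.6f} s for 10000 calls: {label}")
```

Taking the minimum of several runs reduces the influence of whatever else the machine happens to be doing during one measurement.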

If we also take the time spent importing into account, splitting twice (i.e. Method B) is still the fastest

(assuming the code is only called once):

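    # Run this in a fresh interpreter: modules imported earlier are cached,
    # so a repeated import would cost almost nothing.
    import time

    given_path = "/home/home/user/workspace/data/task_2022_02_xx_xx_xx_xx.pkl"

    time1 = time.time()
    seq_name = given_path.split("/")[-1].split(".")[0]
    print(time.time()-time1, 'time of split')

    time2 = time.time()
    import pathlib
    seq_name = pathlib.Path(given_path).stem
    print(time.time()-time2, 'time of pathlib')

    time3 = time.time()
    import os
    seq_name = os.path.basename(given_path).split(".")[0]
    print(time.time()-time3, 'time of os.path')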

Speed test result (seconds, single call including the import):

    0.000001430511474609375 time of split
    0.003416776657104492 time of pathlib
    0.0000030994415283203125 time of os.path
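A caveat on the import cost: in a default CPython start-up the site module has normally already imported os before any user code runs, so "import os" is then little more than a lookup in sys.modules. That would explain why the os.path figure stays so small. Below is a rough sketch that checks this in a fresh interpreter via subprocess. The probe string and variable names are my own, and the behaviour may differ if Python is started with unusual options.

```python
import subprocess
import sys

# The probe runs in a brand-new interpreter, so imports made in the
# current session cannot influence the measurement.
probe = (
    "import sys, time\n"
    "print('os' in sys.modules)\n"
    "start = time.perf_counter()\n"
    "import pathlib\n"
    "print(time.perf_counter() - start)\n"
)

result = subprocess.run(
    [sys.executable, "-c", probe],
    capture_output=True,
    text=True,
    check=True,
)

# The probe prints two whitespace-separated values: a bool and a float.
os_preloaded, pathlib_import_seconds = result.stdout.split()
print("os already loaded at start-up:", os_preloaded)
print("seconds to import pathlib:", pathlib_import_seconds)
```

If the code is part of a larger program that already uses os or pathlib anywhere, the import is paid once and the per-call cost from the loop test is the more relevant number.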

Upvotes: 2

EvensF

Reputation: 1610

I think the more elegant way is to use the pathlib.Path.stem property (it is an attribute, not a method, so no parentheses are needed):

    import pathlib

    filename = "/home/user/workspace/data/seq_00034.pkl"
    path = pathlib.Path(filename)

    print(path.stem)
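On the "safer" part of the question, the approaches are not equivalent for every input. split(".")[0] cuts at the first dot, while stem and os.path.splitext remove only the last extension, and splitting on "/" does not understand Windows-style separators. A small sketch of the differences follows. The two sample paths and the helper names are hypothetical, and the Pure* classes are used only so the result does not depend on the operating system running the code.

```python
import os
import pathlib


def stem_split(path):
    # Method B from the question.
    return path.split("/")[-1].split(".")[0]


def stem_splitext(path):
    # os.path variant that removes only the last extension.
    return os.path.splitext(os.path.basename(path))[0]


# Hypothetical file names, not taken from the question.
multi_dot = "/home/user/workspace/data/seq_00034.v2.pkl"
windows_style = r"C:\data\seq_00034.pkl"

print(stem_split(multi_dot))                        # seq_00034
print(stem_splitext(multi_dot))                     # seq_00034.v2
print(pathlib.PurePosixPath(multi_dot).stem)        # seq_00034.v2

print(stem_split(windows_style))                    # C:\data\seq_00034
print(pathlib.PureWindowsPath(windows_style).stem)  # seq_00034
```

For names like seq_00034.pkl all variants agree, so the choice only matters if the file names can contain extra dots or the paths can come from another platform.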

Upvotes: 3
