Aizzaac

Reputation: 3318

How to get the filename without the path from a path

I have a list of "pickle" files (see Image 1). I want to use the name of each file as the index in a Pandas dataframe, but so far I get the full path (which is long) plus the file's name.

I have found this link: How to get the filename without the extension from a path in Python?

The answer there is to use ".stem" somewhere in my code, but I do not know where. Also, my files do not have an extension.

import pandas as pd
import glob
from pathlib import Path


# This is the path to the folder which contains all the "pickle" files
dir_path = Path(r'C:\Users\OneDrive\Projects\II\Coral\Classification\inference_time')
files = dir_path.glob('**/file_inference_time*')  


df_list = []  # empty list that will collect one dataframe per file

for file in files:
    df = pd.DataFrame(pd.read_pickle(file))  # load each "pickle" file into a dataframe

    df['file'] = file  # add a column 'file' holding the full path (directory + filename)

    df_list.append(df)  # collect the dataframes in the list


df_list_all = pd.concat(df_list).reset_index(drop=True) #merging all dataframes into a single one

df_list_all

THIS IS WHAT I GET:

    Inference_Time  file
0   2.86    C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_InceptionV1
1   30.96   C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_mobileNetV2
2   11.04   C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_efficientNet

I WANT THIS:

              Inference_Time  file
InceptionV1   2.86            C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_InceptionV1
mobileNetV2   30.96           C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_mobileNetV2
efficientNet  11.04           C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_efficientNet

IMAGE 1


Upvotes: 0

Views: 899

Answers (2)

hume

Reputation: 2553

Check out pandas-path, which gives you a .path accessor on a Series that exposes the usual pathlib methods and properties.

import pandas as pd
from pandas_path import path

# these could be Windows paths; POSIX paths are used here only because I am on a POSIX machine
data = [
    ("folder/inference_time_InceptionV1", 10),
    ("folder2/inference_time_mobileNetV2", 20),
    ("folder4/inference_time_efficientNet", 30),
]

df = pd.DataFrame(data, columns=['file', 'time'])
(
    df.file.path.name  # use path accessor from pandas_path to get just the filename
     .str.split('_')   # split into components based on "_" 
     .str[-1]          # select last component
)
#> 0     InceptionV1
#> 1     mobileNetV2
#> 2    efficientNet
#> Name: file, dtype: object

Created at 2021-03-06 10:57:59 PST by reprexlite v0.4.2
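
If you would rather not add a dependency, the same idea can probably be done with the standard library's pathlib. The snippet below is only a sketch: the sample rows are copied from the question's output, and `PREFIX` and `model_name` are names made up for illustration. `PureWindowsPath` is used so the Windows-style strings are parsed the same way on any operating system. Since the filenames have no extension, `.name` and `.stem` return the same string here.

```python
from pathlib import PureWindowsPath

import pandas as pd

# Sample rows copied from the question's output (stand-in for the real dataframe)
df = pd.DataFrame(
    {
        "Inference_Time": [2.86, 30.96, 11.04],
        "file": [
            r"C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_InceptionV1",
            r"C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_mobileNetV2",
            r"C:\Users\OneDrive\Projects\Classification\inference_time\inference_time_efficientNet",
        ],
    }
)

PREFIX = "inference_time_"  # assumed common filename prefix


def model_name(path_str: str) -> str:
    """Return the filename without its directories and without PREFIX."""
    name = PureWindowsPath(path_str).name  # last path component only
    return name[len(PREFIX):] if name.startswith(PREFIX) else name


# Use the shortened names as the index; the 'file' column is left untouched
df.index = [model_name(p) for p in df["file"]]
print(df.index.tolist())
```

Checking `startswith` before slicing means a file that does not follow the naming pattern keeps its full filename instead of being silently truncated.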

Upvotes: 1

Mayank Porwal

Reputation: 34046

You can transform your output to this:

In [1603]: df
Out[1603]: 
   Inference_Time                                               file
0            2.86  C:\Users\OneDrive\Projects\Classification\infe...
1           30.96  C:\Users\OneDrive\Projects\Classification\infe...
2           11.04  C:\Users\OneDrive\Projects\Classification\infe...

In [1607]: df = df.set_index(df['file'].str.split('inference_time_').str[-1])

In [1610]: df.index.name = None  # drop the index name inherited from the 'file' column

In [1608]: df
Out[1608]: 
              Inference_Time                                               file

InceptionV1             2.86  C:\Users\OneDrive\Projects\Classification\infe...
mobileNetV2            30.96  C:\Users\OneDrive\Projects\Classification\infe...
efficientNet           11.04  C:\Users\OneDrive\Projects\Classification\infe...
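
Alternatively, the index could be set while the files are loaded, so nothing has to be fixed afterwards. This is a rough sketch, not the asker's exact setup: it writes three small hypothetical pickle files into a temporary directory so that it runs on its own, and `PREFIX`, `samples` and `frames` are illustrative names. In the real loop only the lines inside `for file in ...` would be needed.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd

PREFIX = "inference_time_"  # assumed common filename prefix

with tempfile.TemporaryDirectory() as tmp:
    dir_path = Path(tmp)

    # Hypothetical stand-ins for the real pickle files (values from the question)
    samples = {"InceptionV1": 2.86, "mobileNetV2": 30.96, "efficientNet": 11.04}
    for model, seconds in samples.items():
        with open(dir_path / f"{PREFIX}{model}", "wb") as fh:
            pickle.dump({"Inference_Time": [seconds]}, fh)

    frames = []
    for file in dir_path.glob(f"**/{PREFIX}*"):
        df = pd.DataFrame(pd.read_pickle(file))
        df["file"] = str(file)  # keep the full path as a column
        # file.name drops the directories; slicing drops the prefix
        df.index = [file.name[len(PREFIX):]] * len(df)
        frames.append(df)

# Concatenate without reset_index so the model names stay as the index
result = pd.concat(frames)
print(result["Inference_Time"])
```

Note that `reset_index(drop=True)` from the question is left out on purpose, since it would replace the model names with 0, 1, 2 again.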

Upvotes: 1
