Reputation: 1568
I am trying to use rsplit and str[] to remove the subdirectories from the paths in this dataframe and keep just the file name in the dataframe's column.
import pandas as pd
data = ['D:/xyz/abc/123/file_1.txt', 'D:/xyz/abc/file2.txt', 'D:/xyz/file_2.txt']
data = pd.DataFrame(data)
data[0].str.rsplit('/').str[3]
Returns:
Out[1]:
0 123
1 file2.txt
2 NaN
Name: 0, dtype: object
As you can see, this does not return just the txt file names, regardless of which index I pass to str[].
Desired output:
Out[1]:
0 file_1.txt
1 file2.txt
2 file_2.txt
Name: 0, dtype: object
Any insight would be appreciated. Thanks.
Upvotes: 0
Views: 35
Reputation: 2034
You can use os.path.split to get the last section of the path:
https://docs.python.org/3.3/library/os.path.html?highlight=path#os.path.split
import os
f = lambda x: os.path.split(x)[1]
data[0] = data[0].map(f)
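As a side note, os.path.basename returns the same value as os.path.split(x)[1], so the lambda is not strictly needed. A minimal sketch, assuming the paths always use forward slashes as in the question (the variable name names is only for illustration):

```python
import os
import pandas as pd

# Same sample paths as in the question
data = pd.DataFrame(['D:/xyz/abc/123/file_1.txt',
                     'D:/xyz/abc/file2.txt',
                     'D:/xyz/file_2.txt'])

# os.path.basename(p) is equivalent to os.path.split(p)[1]
names = data[0].map(os.path.basename)
print(names.tolist())
```

This leaves the original column untouched; assign the result back to data[0] if you want to overwrite it.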
Upvotes: 0
Reputation: 25249
Try rsplit with n=1 (split at most once, starting from the right) and pick the last item:
data[0].str.rsplit('/', n=1).str[-1]
Out[194]:
0 file_1.txt
1 file2.txt
2 file_2.txt
Name: 0, dtype: object
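The reason str[3] fails in the question is that each path has a different number of components, so a fixed index from the left points at a different piece on every row, while the last item is always the file name. A rough illustration in plain Python, using the built-in str.rsplit rather than the pandas accessor (the names paths and rows are only for this sketch):

```python
paths = ['D:/xyz/abc/123/file_1.txt',
         'D:/xyz/abc/file2.txt',
         'D:/xyz/file_2.txt']

rows = []
for p in paths:
    parts = p.rsplit('/')                            # no limit: split on every '/'
    fourth = parts[3] if len(parts) > 3 else None    # pandas shows NaN when the index is missing
    last = p.rsplit('/', 1)[-1]                      # split once from the right, keep the tail
    rows.append((len(parts), fourth, last))

for row in rows:
    print(row)
```

The first path splits into 5 pieces, the second into 4 and the third into 3, which is why index 3 gives 123, file2.txt and NaN respectively.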
Upvotes: 2