Hector

Reputation: 671

How can I replace a substring in a Python pathlib.Path?

Is there an easy way to replace a substring within a pathlib.Path object in Python? The pathlib module is nicer in many ways than storing a path as a str and using os.path, glob.glob, etc., whose functionality is built in to pathlib. But I often work with files that follow a naming pattern, and I replace substrings in a path to access related files:

data/demo_img.png
data/demo_img_proc.png
data/demo_spreadsheet.csv

Previously I could do:

img_file_path = "data/demo_img.png"
proc_img_file_path = img_file_path.replace("_img.png", "_img_proc.png")
data_file_path = img_file_path.replace("_img.png", "_spreadsheet.csv")

pathlib can replace the file extension with the with_suffix() method, but it only accepts extensions as valid suffixes. The workarounds are:

import pathlib
import os


img_file_path = pathlib.Path("data/demo_img.png")
proc_img_file_path = pathlib.Path(str(img_file_path).replace("_img.png", "_img_proc.png"))
# os.fspath() is available in Python 3.6+ and is apparently safer than str()
data_file_path = pathlib.Path(os.fspath(img_file_path).replace("_img.png", "_spreadsheet.csv"))

Converting to a string to do the replacement and reconverting to a Path object seems laborious. Assume that I never have a copy of the string form of img_file_path, and have to convert the type as needed.

Upvotes: 33

Views: 31879

Answers (5)

ibykovsky

Reputation: 49

Use PurePath.with_name() or PurePath.with_stem().
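As a minimal sketch of what that could look like for the file names in the question: PurePosixPath is used only so the printed result is the same on every platform, and the variable names are illustrative. Since with_stem() needs Python 3.9+, the stem change is spelled with with_name() here, which should give the same result on older versions.

```python
from pathlib import PurePosixPath

img = PurePosixPath("data/demo_img.png")

# with_name() swaps the whole final component (stem + suffix).
data = img.with_name("demo_spreadsheet.csv")

# Changing only the stem while keeping the suffix; on Python 3.9+ this
# is what img.with_stem(img.stem + "_proc") does.
proc = img.with_name(img.stem + "_proc" + img.suffix)

print(data)  # data/demo_spreadsheet.csv
print(proc)  # data/demo_img_proc.png
```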

Upvotes: 4

Jean-Francois T.

Reputation: 12950

I would also instinctively use str to do the replacements.

If you would like to make changes within each part, you could alternatively break the path into its parts, perform the replacements, and then recombine them:

path = Path(*(p.replace(old, new) for p in path.parts))
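A self-contained version of that one-liner might look like the sketch below. The values of old and new and the sample path are made up for illustration, and PurePosixPath is used so the output does not depend on the platform.

```python
from pathlib import PurePosixPath

old, new = "demo", "final"  # hypothetical substrings
path = PurePosixPath("demo_data/demo_img.png")

# Apply str.replace() to each component separately, then rebuild the path.
replaced = PurePosixPath(*(part.replace(old, new) for part in path.parts))
print(replaced)  # final_data/final_img.png

# Because every part is handled on its own, a pattern that spans a
# separator can never match, so this path comes back unchanged.
spanning = PurePosixPath(*(part.replace("data/demo", "x") for part in path.parts))
print(spanning)  # demo_data/demo_img.png
```

Note that the replacement is applied to every component, including the directories, which may or may not be what you want.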

The performance of str looks better, though:

[Image: timeit results]

Upvotes: 0

deprecated

Reputation: 2215

Expanding on ibykovsky's answer, the .with_name() and .with_stem() methods are a clean way of solving this in 3.9+. In common with Tonechas' answer, this avoids any risk of inadvertently changing the parent part.

For the example in the question, we can write:

import pathlib

p = pathlib.Path("data/demo_img.png")
# Simple case of appending extra text to the Path.stem does not need to use str.replace()
p_proc = p.with_stem(p.stem + "_proc")
# More complex case uses str.replace() on the Path.name
p_data = p.with_name(p.name.replace("_img.png", "_spreadsheet.csv"))

The first substitution simply appends a string to the stem, so it does not need str.replace(). Note, however, that doing it via the stem means the substitution is made whatever the suffix (e.g., for .jpg as well as .png), which differs from the code in the question. This may or may not be what you want.
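To make the difference in suffix handling concrete, here is a small sketch with a hypothetical .jpg file. PurePosixPath is used so the output is platform-independent, and with_stem() requires Python 3.9+.

```python
from pathlib import PurePosixPath

jpg = PurePosixPath("data/demo_img.jpg")  # hypothetical file name

# Stem-based: the suffix is ignored, so the .jpg file is renamed too.
via_stem = jpg.with_stem(jpg.stem + "_proc")

# Name-based replace: "_img.png" does not occur in "demo_img.jpg",
# so the path comes back unchanged.
via_replace = jpg.with_name(jpg.name.replace("_img.png", "_img_proc.png"))

print(via_stem)     # data/demo_img_proc.jpg
print(via_replace)  # data/demo_img.jpg
```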

Swapping the suffix can be done with .with_suffix(), but if substitutions are required in the parent parts, this approach becomes more unwieldy. There is no .with_parent() method, although .with_segments() was introduced in 3.12.

Upvotes: 6

Tonechas

Reputation: 13743

I recently faced a similar problem and found this thread when searching for a solution. In contrast to the accepted answer, I did not convert the pathlib.Path object into a string. Instead, I used its parent and name attributes (name is itself a string), along with the joinpath() method. Here is the code:

In [2]: from pathlib import Path

In [3]: img_file_path = Path('data/demo_img.png')

In [4]: parent, name = img_file_path.parent, img_file_path.name

In [5]: proc_fn = name.replace('_img.png', '_img_proc.png')
   ...: data_fn = name.replace('_img.png', '_spreadsheet.csv')

In [6]: proc_img_file_path = Path(parent).joinpath(proc_fn)
   ...: data_img_file_path = Path(parent).joinpath(data_fn)

In [7]: proc_img_file_path
Out[7]: WindowsPath('data/demo_img_proc.png')

In [8]: data_img_file_path
Out[8]: WindowsPath('data/demo_spreadsheet.csv')

An advantage of this approach is that it avoids the risk of making unwanted replacements in the parent part of the path.
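The risk is easiest to see when the substring also occurs in a directory name. The following sketch uses a made-up path and substrings, with PurePosixPath for platform-independent output, to compare the two approaches:

```python
from pathlib import PurePosixPath

p = PurePosixPath("demo/demo_img.png")  # "demo" appears in the directory too

# Replacing on the full string also rewrites the directory name.
whole = PurePosixPath(str(p).replace("demo", "test"))

# Replacing on the name only leaves the parent untouched.
name_only = p.parent.joinpath(p.name.replace("demo", "test"))

print(whole)      # test/test_img.png
print(name_only)  # demo/test_img.png
```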

Upvotes: 3

J_H

Reputation: 20560

You are correct. To replace old with new in Path p, you need:

p = Path(str(p).replace(old, new))

EDIT

We turn Path p into str so we get this str method:

Help on method_descriptor:

replace(self, old, new, count=-1, /)

Return a copy with all occurrences of substring old replaced by new.

Otherwise we'd get this Path method:

Help on function replace in module pathlib:

replace(self, target)

Rename this path to the given path, clobbering the existing destination if it exists, and return a new Path instance pointing to the given path.
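If the conversion is needed often, it could be wrapped in a small helper. The function name replace_in_path below is hypothetical (it is not part of pathlib), and the sketch uses os.fspath(), mentioned in the question, in place of str(). Unlike Path.replace(), it only builds a new path object and does not touch anything on disk.

```python
import os
from pathlib import PurePosixPath


def replace_in_path(p, old, new):
    """Return a new path of the same class as p, with old replaced by new
    in its string form. Purely textual: nothing is renamed on disk."""
    return type(p)(os.fspath(p).replace(old, new))


p = PurePosixPath("data/demo_img.png")
q = replace_in_path(p, "_img.png", "_spreadsheet.csv")

print(q)                 # data/demo_spreadsheet.csv
print(type(q).__name__)  # PurePosixPath
```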

Upvotes: 42
