Reputation: 199
When I use the regex=[True|False] flag in the pd.Series.str.replace() method, I get contradictory exceptions:
repl is a dictionary => it says "repl must be a string or callable"
repl is a callable => it says "Cannot use a callable replacement when regex=False"
I'm trying to replace the month part of a Spanish date from a DataFrame index with the corresponding English short name.
import pandas as pd
import numpy as np
# Define the months' short names in English and Spanish
ENG = ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC']
ESP = ['ENE', 'FEB', 'MAR', 'ABR', 'MAY', 'JUN', 'JUL', 'AGO', 'SEP', 'OCT', 'NOV', 'DIC']
# Dictionary mapping Spanish months to English months
esp2eng = dict(zip(ESP, ENG))
# Function to make the dictionary "callable"
def eng_from_esp(key):
    return esp2eng[key]
# Create the DF with date in the "%d-%b-%y" format as index, where %b is the Spanish naming
idx = ['06-{}-19'.format(m) for m in ESP]
col = ['ordinal']
data = pd.DataFrame(np.arange(12).reshape((12, 1)),
                    index=idx,
                    columns=col)
data.index.str.replace('ENE', esp2eng, regex=False)
TypeError: repl must be a string or callable
data.index.str.replace('ENE', eng_from_esp, regex=False)
ValueError: Cannot use a callable replacement when regex=False
Upvotes: 4
Views: 17835
Reputation: 43524
If you look at the documentation for pandas.Series.str.replace, you will see that the repl argument can be a string or callable, but a dict is not supported.
With that in mind, your first attempt is not supported.
Digging into the source code (key parts reproduced below), you can see that the check for a string or callable is done first, before the regex flag is checked.
# Check whether repl is valid (GH 13438, GH 15055)
if not (is_string_like(repl) or callable(repl)):
    raise TypeError("repl must be a string or callable")
if regex:
    # omitted
else:
    # omitted
    if callable(repl):
        raise ValueError("Cannot use a callable replacement when "
                         "regex=False")
So your first attempt (using a dictionary for repl) trips the first if check, which raises the TypeError with the message "repl must be a string or callable".
Your second attempt passes this check, but then gets tripped by the check for a callable inside the else block of the regex check.
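To see the order of the two checks in action, here is a minimal sketch that triggers both exceptions on a small Series. The two-element Series, the trimmed-down esp2eng mapping, and the outcomes list are made up for illustration; only str.replace and its regex flag come from the question.

```python
import pandas as pd

# Small illustrative inputs (not the full data from the question)
s = pd.Series(["06-ENE-19", "06-ABR-19"])
esp2eng = {"ENE": "JAN", "ABR": "APR"}

outcomes = []

# 1) A dict is neither a string nor a callable, so the type check on repl
#    fails first, whatever the value of regex.
for flag in (True, False):
    try:
        s.str.replace("ENE", esp2eng, regex=flag)
    except TypeError as exc:
        outcomes.append((flag, type(exc).__name__))

# 2) A callable passes that type check, but is rejected afterwards
#    because regex=False.
try:
    s.str.replace("ENE", lambda m: "JAN", regex=False)
except ValueError as exc:
    outcomes.append((False, type(exc).__name__))

print(outcomes)
```

The dict case fails the same way for both values of regex, which is why the two messages only look contradictory.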
So in short, there is no inconsistency. Sure, the first error message could potentially be improved to say something like "repl must be a string or callable (unless you're using regex=False)", but that's not really necessary.
FWIW, here is a pandas "one-liner" that should achieve the desired result:
print(
    data.reset_index()
    .replace(esp2eng, regex=True)
    .set_index("index", drop=True)
    .rename_axis(None, axis=0)
)
# ordinal
#06-JAN-19 0
#06-FEB-19 1
#06-MAR-19 2
#06-APR-19 3
#06-MAY-19 4
#06-JUN-19 5
#06-JUL-19 6
#06-AUG-19 7
#06-SEP-19 8
#06-OCT-19 9
#06-NOV-19 10
#06-DEC-19 11
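If you would rather stay with str.replace, a callable does work once regex=True is set. The catch is that the callable receives a regex match object (as with re.sub), not the matched string, so it has to look up match.group(0). The sketch below assumes the same ENG/ESP lists as the question; the names pattern, eng_from_match, and translated are mine, and the alternation pattern is just one way to match the month part.

```python
import pandas as pd

ENG = ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC']
ESP = ['ENE', 'FEB', 'MAR', 'ABR', 'MAY', 'JUN', 'JUL', 'AGO', 'SEP', 'OCT', 'NOV', 'DIC']
esp2eng = dict(zip(ESP, ENG))

idx = pd.Index(['06-{}-19'.format(m) for m in ESP])

# Alternation of all Spanish month names: 'ENE|FEB|...|DIC'
pattern = "|".join(ESP)

def eng_from_match(match):
    # The callable is handed a match object, so look up the matched text
    return esp2eng[match.group(0)]

translated = idx.str.replace(pattern, eng_from_match, regex=True)
print(list(translated))
```

Assigning the result back with data.index = translated would update the DataFrame from the question in place of the reset_index/set_index round trip.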
Upvotes: 3