Reputation: 23036
Does the pandas.DataFrame.replace regex replace support wildcards and "capture groups"? E.g., to replace ([A-Z])(\w+) with \2\1?
What kind of regular expression is supported? Are Perl's regex extensions supported? E.g., is it OK to replace ([A-Z])(\w+) with \l\1\2 (\l: change the next character to lowercase)?
UPDATE:
As Steve has pointed out, according to the Python documentation, it should work, but the following is not giving me what I expected:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.random.choice(['foo', 'bar'], 100),
                   'B': np.random.choice(['one', 'two', 'three'], 100),
                   'C': np.random.choice(['I1', 'I2', 'I3', 'I4'], 100),
                   'D': np.random.randint(-10, 11, 100),
                   'E': np.random.randn(100)})
df.replace("f(.)(.)", "b\1\2", regex=True, inplace=True)
What's wrong?
Thx
Upvotes: 3
Views: 2059
Reputation: 43743
According to the pandas documentation:
Regex substitution is performed under the hood with re.sub. The rules for substitution for re.sub are the same.
So, yes, any substitutions which can be performed with Python's re.sub (such as \1) can also be performed with pandas.DataFrame.replace. See the Python documentation for more information.
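A likely reason the snippet in the question misbehaves is that "b\1\2" is an ordinary Python string, so Python converts \1 and \2 into control characters before re.sub ever sees them. Writing the pattern and replacement as raw strings should fix that. Python's re module also has no Perl-style \l escape; one way to approximate it is a callable replacement with Series.str.replace. Below is a minimal sketch under those assumptions; the small frame and the variable names (fixed, swapped, lowered) are made up for illustration.

```python
import pandas as pd

# Small illustrative frame (not the random data from the question).
df = pd.DataFrame({"A": ["foo", "bar", "foo"], "D": [1, 2, 3]})

# In a normal string literal, "\1" and "\2" are octal escapes, so the
# replacement contains chr(1) and chr(2) instead of backreferences.
assert "b\1\2" == "b\x01\x02"

# Raw strings keep the backslashes, so re.sub sees real backreferences.
fixed = df.replace(r"f(.)(.)", r"b\1\2", regex=True)
print(fixed["A"].tolist())  # ['boo', 'bar', 'boo']

# Swapping capture groups works the same way.
swapped = df.replace(r"([a-z])(\w+)", r"\2\1", regex=True)
print(swapped["A"].tolist())  # ['oof', 'arb', 'oof']

# Perl's \l is not available in Python's re; a callable replacement
# passed to Series.str.replace can lowercase the first group instead.
s = pd.Series(["Foo", "Bar"])
lowered = s.str.replace(
    r"([A-Z])(\w+)",
    lambda m: m.group(1).lower() + m.group(2),
    regex=True,
)
print(lowered.tolist())  # ['foo', 'bar']
```

The callable form is used with Series.str.replace here rather than DataFrame.replace, so it has to be applied column by column.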
Upvotes: 3