Qaswed

Reputation: 3879

How to use map with a dictionary having regular expression keys?

I have a DataFrame with a variable I want to map, using a dictionary where the keys are not "normal" strings, but regular expressions.

import pandas as pd
import re
df = pd.DataFrame({'cat': ['A1', 'A2', 'B1']})

What I would like to do is df['cat'].map({r'A\d': 'a', 'B1': 'b'}), but A\d does not seem to be interpreted as a regex. In this simple MWE I could do df['cat'].map({'A1': 'a', 'A2': 'a', 'B1': 'b'}), but in the real world the regex is much more complicated. The dictionary is also much more complicated, so the solution here (which requires adding start and end statements and applying re.compile to the keys) is not feasible.

Upvotes: 6

Views: 1523

Answers (2)

Quang Hoang

Reputation: 150765

I'm not sure how complicated your dictionary is. But if it is not too long, we can just match and replace one by one:

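maps = {r'A\d': 'a', 'B1': 'b'}  # raw string keeps \d intact as a regex escape
# One boolean column per pattern; the dot product with the target values
# picks the value whose pattern matched in each row.
(pd.concat((df['cat'].str.match(k) for k in maps), axis=1, ignore_index=True)
  .dot(pd.Series(list(maps.values())))
)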

Output:

0    a
1    a
2    b
dtype: object
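
The same match-one-by-one idea can also be written as a lookup function passed to map. This is only a sketch under some assumptions: the helper name regex_lookup is made up for illustration, a 'C3' row is added to show the no-match case, the first pattern that matches wins, and fullmatch is used so that each pattern must cover the whole value.

```python
import re

import pandas as pd

df = pd.DataFrame({'cat': ['A1', 'A2', 'B1', 'C3']})
maps = {r'A\d': 'a', r'B1': 'b'}

# Compile every key once, programmatically, so the dictionary itself
# does not have to be edited by hand.
compiled = [(re.compile(pattern), target) for pattern, target in maps.items()]


def regex_lookup(value, default=None):
    """Return the target of the first pattern that fully matches value."""
    for pattern, target in compiled:
        if pattern.fullmatch(value):
            return target
    return default  # no pattern matched


result = df['cat'].map(regex_lookup)
print(result)
```

Values that match no pattern come back as missing, which mirrors how map behaves with a plain dictionary. Because the loop runs in Python for every row, this is likely to be slow on large frames.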

Upvotes: 1

piRSquared

Reputation: 294348

Use replace with regex=True

map accepts a callable or a dictionary. When you pass it a dictionary, each value is looked up as an exact key (values without a matching key become NaN), so the keys are never treated as regexes. For your purposes, replace is appropriate.

df.replace({r'A\d': 'a', 'B1': 'b'}, regex=True)

  cat
0   a
1   a
2   b
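
One thing to watch: with regex=True, replace substitutes only the part of the string that matched, so an unanchored pattern can leave the rest of the value behind. The sketch below illustrates this on a single column; the extra 'A10' row and the names regex_map and anchored_map are assumptions added for illustration, and the anchors are generated in code so the original dictionary stays untouched.

```python
import pandas as pd

df = pd.DataFrame({'cat': ['A1', 'A2', 'B1', 'A10']})
regex_map = {r'A\d': 'a', r'B1': 'b'}

# Unanchored patterns: only the matched substring is replaced,
# so 'A10' keeps its trailing '0'.
loose = df['cat'].replace(regex_map, regex=True)

# Wrap each key in ^(?:...)$ so a pattern must cover the whole value.
anchored_map = {rf'^(?:{pattern})$': target
                for pattern, target in regex_map.items()}
strict = df['cat'].replace(anchored_map, regex=True)

print(pd.DataFrame({'cat': df['cat'], 'loose': loose, 'strict': strict}))
```

Unlike map, replace leaves values that match no pattern unchanged rather than turning them into NaN.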

Upvotes: 6
