lucky1928

Reputation: 8849

python3 - apply a regex map to column

How can I map a data frame column through a dict whose keys are regex patterns?

import pandas as pd

df = pd.DataFrame({'col1': ['negative', 'positive', 'neutral', 'neutral', 'positive']})
cdict = {'n.*': -1, 'p.*': 0}
df['col2'] = df['col1'].map(cdict)

print(df.head())

Current output is:

:        col1  col2
: 0  negative   NaN
: 1  positive   NaN
: 2   neutral   NaN
: 3   neutral   NaN
: 4  positive   NaN

But expected results:

:        col1  col2
: 0  negative    -1
: 1  positive     0
: 2   neutral    -1
: 3   neutral    -1
: 4  positive     0

Upvotes: 2

Views: 67

Answers (2)

Mayank Porwal

Reputation: 34076

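You don't need a dict for this at all. Both patterns only test the first letter, so a prefix check is enough.

Use numpy.select with Series.str.startswith. Rows that match neither condition get np.select's default of 0, so pass default= if unmatched values must be distinguishable from 'positive':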

In [1927]: import numpy as np

In [1928]: conds = [df.col1.str.startswith('n'), df.col1.str.startswith('p')]

In [1929]: choices = [-1, 0]

In [1930]: df['col2'] = np.select(conds, choices)

In [1931]: df
Out[1931]: 
       col1  col2
0  negative    -1
1  positive     0
2   neutral    -1
3   neutral    -1
4  positive     0

Upvotes: 2

anky

Reputation: 75080

Series.map looks each value up as an exact dict key, so the regex patterns never match and every row becomes NaN. Use Series.replace with regex=True instead:

df['col2'] = df['col1'].replace(cdict, regex=True)
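
If you would rather keep the map call, a function can be passed in place of the dict. Below is a minimal sketch using only the standard library; regex_lookup and compiled are hypothetical names for illustration (not part of pandas), and it assumes each pattern should match the whole string:

```python
import re

# Same pattern -> value mapping as in the question.
cdict = {'n.*': -1, 'p.*': 0}

# Compile once so the patterns are not re-parsed for every cell.
compiled = [(re.compile(pattern), value) for pattern, value in cdict.items()]

def regex_lookup(text, default=None):
    """Return the value of the first pattern that matches the whole string."""
    for pattern, value in compiled:
        if pattern.fullmatch(text):
            return value
    return default  # no pattern matched

values = ['negative', 'positive', 'neutral', 'neutral', 'positive']
print([regex_lookup(v) for v in values])  # [-1, 0, -1, -1, 0]
```

With the question's frame this could be applied as df['col2'] = df['col1'].map(regex_lookup). Unmatched values come back as None unless a default is given.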

Upvotes: 4
