Teemo

Reputation: 71

pandas: build new columns from a list of booleans

I need to build two new columns from a list of booleans, in order to classify phone numbers as mobile or not.

Sample data:

mobile_phone
85295649956
85398745632
8612345678945
34512654

Here is my code:

import csv
import re
import pandas as pd
import numpy as np

df = pd.read_csv('test.csv',delimiter='|',dtype = str)

a = r'852[4-9]|853[4-9]|86'

print(list(map(lambda x: bool(re.match(a, x)), df['mobile_phone'])))

The output is:

[True,True,True,False]

I can produce the list of booleans, but I don't know how to use it to build the columns.

I tried something like this:

import csv
import re
import pandas as pd
import numpy as np

df = pd.read_csv('test.csv',delimiter='|',dtype = str)

a = r'852[4-9]|853[4-9]|86'
df['mobile'] = np.where(
     (lambda x: bool(re.match(a, x)), df['mobile_phone']) = True
    ,df['mobile_phone']
    ,nan
)
df['phone'] = np.where(
     (lambda x: bool(re.match(a, x)), df['mobile_phone']) = True,
    nan,
    df['mobile_phone']

)

I tried to use np.where, but it doesn't work: Python raises the error "keyword can't be an expression".

How can I get a result like this?

Desired result:

mobile_phone           mobile        phone
85295649956       85295649956          nan
85398745632       85398745632          nan
8612345678945   8612345678945          nan
34512654                  nan     34512654      

Upvotes: 0

Views: 63

Answers (1)

Nick

Reputation: 147166

You could just use Series.apply to process your values into new columns. For example:

import pandas as pd
import re
import math

df = pd.DataFrame({'mobile_phone': ['85295649956', '85398745632', '8612345678945', '34512654', '54861245'] })

a = r'852[4-9]|853[4-9]|86'
df['mobile'] = df['mobile_phone'].apply(lambda p: p if re.match(a, p) else math.nan)
df['phone'] = df['mobile_phone'].apply(lambda p: math.nan if re.match(a, p) else p)
df

Output:

    mobile_phone         mobile     phone
0    85295649956    85295649956       NaN
1    85398745632    85398745632       NaN
2  8612345678945  8612345678945       NaN
3       34512654            NaN  34512654
4       54861245            NaN  54861245
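
If you would rather stay close to the np.where attempt in the question, something like the following may also work. This is only a sketch under a few assumptions: every value in mobile_phone is a string (as with dtype=str and no missing rows), and the name is_mobile is one I made up for the boolean mask. np.where takes the condition as a plain boolean array in the first position, so there is no "= True" comparison inside the call.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'mobile_phone': ['85295649956', '85398745632', '8612345678945', '34512654', '54861245']})

a = r'852[4-9]|853[4-9]|86'

# Series.str.match tests the pattern at the start of each string, like re.match.
# is_mobile is an illustrative name for the resulting boolean mask.
is_mobile = df['mobile_phone'].str.match(a)

# np.where(condition, value_if_true, value_if_false)
df['mobile'] = np.where(is_mobile, df['mobile_phone'], np.nan)
df['phone'] = np.where(is_mobile, np.nan, df['mobile_phone'])
```

The same mask could probably be used with pure pandas as well: df['mobile_phone'].where(is_mobile) keeps the values where the mask is True, and df['mobile_phone'].mask(is_mobile) keeps them where it is False, filling the rest with NaN in both cases.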

Upvotes: 1
