xtian

Reputation: 2947

How to determine the logical truth value of N boolean Pandas columns?

I have a pipeline which performs analysis on a table and adds extra features to classify each row of data. In this toy case I have a table with features [id, x, y, z] and I'm adding has_adj. I can't figure out how to determine the logical truth value across N columns (i.e. the number of columns checked for an adjustment could be N):

    id   x     y     z     n   has_adj_0  has_adj_1  has_adj_n
0   AX1  10.0  Adj   <NA>  ..  True       False      ...
1   V0D  3.5   <NA>  <NA>  ..  False      False      ...
2   G7L  8.0   <NA>  Adj   ..  False      True       ...

Finally, I want to set df['has_adj'] to True where the row contains a True value in any of the has_adj_* columns, else False.

Here is the toy example to produce the above table:

import pandas as pd
import re

def hf_txn_has_adj(text, regex_dict):
    if pd.isna(text):
        return False

    rx = re.compile(regex_dict['regex_value'])
    # rx.match only matches at the start of the string; bool() turns the match object (or None) into True/False
    return bool(rx.match(text))

regex_dict = {'regex_value': '(Adj)'}
df = pd.DataFrame([['AX1', 10, 'Adj', pd.NA], 
                   ['V0D', 3.5, pd.NA, pd.NA], 
                   ['G7L', 8, pd.NA, 'Adj']], 
                  columns=['id', 'x', 'y', 'z'])

for i, adj_feat in enumerate(['y', 'z']):
    df['has_adj_' + str(i)] = df[adj_feat].apply(hf_txn_has_adj, regex_dict=regex_dict)

Upvotes: 3

Views: 58

Answers (1)

Henry Ecker

Reputation: 35676

Try filter to select the has_adj_ columns, then any on axis=1 to reduce each row to a single boolean:

df['has_adj'] = df.filter(like='has_adj_').any(axis=1)

print(df)

df:

    id     x     y     z  has_adj_0  has_adj_1  has_adj
0  AX1  10.0   Adj  <NA>       True      False     True
1  V0D   3.5  <NA>  <NA>      False      False    False
2  G7L   8.0  <NA>   Adj      False       True     True
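
To make the selection step explicit: filter(like=...) keeps every column whose label contains the given substring, so it would also pick up an unrelated column such as a hypothetical 'x_has_adj_note'. If that is a concern, an anchored regex is one possible way to tighten it. The sketch below is self-contained and uses a small made-up frame holding only the flag columns:

```python
import pandas as pd

# Small stand-in frame containing only the id and the per-column flags
df = pd.DataFrame({
    'id': ['AX1', 'V0D', 'G7L'],
    'has_adj_0': [True, False, False],
    'has_adj_1': [False, False, True],
})

# like= is a plain substring test on the column labels
flags = df.filter(like='has_adj_')

# regex= is applied with re.search, so anchor it to match only has_adj_<digits>
flags_strict = df.filter(regex=r'^has_adj_\d+$')

# any(axis=1) collapses each row to True if at least one flag is True
df['has_adj'] = flags_strict.any(axis=1)
print(df)
```

Because the new column is named 'has_adj' without a trailing underscore, neither selector picks it up if the line is run again.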

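If the intermediate has_adj_N columns are not needed for anything else, a possible variant is to skip them and match all source columns in one pass with the vectorized str.match. This is only a sketch: adj_cols and pattern are names chosen here for illustration, and it assumes the same start-of-string matching as re.match in the question.

```python
import pandas as pd

df = pd.DataFrame([['AX1', 10, 'Adj', pd.NA],
                   ['V0D', 3.5, pd.NA, pd.NA],
                   ['G7L', 8, pd.NA, 'Adj']],
                  columns=['id', 'x', 'y', 'z'])

adj_cols = ['y', 'z']   # illustrative: the N columns to scan
pattern = r'(Adj)'      # same regex as regex_dict['regex_value']

# str.match anchors at the start of each string, like re.match;
# na=False makes missing values count as "no match"
matches = df[adj_cols].apply(lambda col: col.str.match(pattern, na=False))

# True where any scanned column matched in that row
df['has_adj'] = matches.any(axis=1)
print(df)
```

This trades the per-column flags for a single pass, so it fits only when has_adj is the sole feature required.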
Upvotes: 4
