How to determine the logical truth value of N boolean Pandas columns?

Question

I have a pipeline that analyzes a table and adds extra features to classify each row of data. In this toy case I have a table with features [id, x, y, z] and I'm adding has_adj. I can't figure out how to determine the logical truth value of N columns (i.e. the number of columns in the adjustment hunt could be N):

    id   x     y     z     n   has_adj_0  has_adj_1  has_adj_n
0   AX1  10.0  Adj   <NA>  ..  True       False      ...
1   V0D  3.5   <NA>  <NA>  ..  False      False      ...
2   G7L  8.0   <NA>  Adj   ..  False      True       ...

Finally, I want to set the feature df['has_adj'] = True where the row contains any True value in those columns, else False.

Here is the toy example to produce the above table:

import pandas as pd
import re

def hf_txn_has_adj(text, regex_dict):
    if pd.isna(text):
        return False

    # match() only tests the pattern at the start of the string
    rx = re.compile(regex_dict['regex_value'])
    return bool(rx.match(text))

regex_dict = {'regex_value': '(Adj)'}
df = pd.DataFrame([['AX1', 10, 'Adj', pd.NA], 
                   ['V0D', 3.5, pd.NA, pd.NA], 
                   ['G7L', 8, pd.NA, 'Adj']], 
                  columns=['id', 'x', 'y', 'z'])

for i, adj_feat in enumerate(['y', 'z']):
    df['has_adj_' + str(i)] = df[adj_feat].apply(hf_txn_has_adj, regex_dict=regex_dict)

Henry Ecker · Accepted Answer

Try filter + any on axis=1:

df['has_adj'] = df.filter(like='has_adj_').any(axis=1)

print(df)

df:

    id     x     y     z  has_adj_0  has_adj_1  has_adj
0  AX1  10.0   Adj  <NA>       True      False     True
1  V0D   3.5  <NA>  <NA>      False      False    False
2  G7L   8.0  <NA>   Adj      False       True     True
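
One thing to watch: filter(like=...) selects every column whose name contains the given substring, so an unrelated column that happens to include 'has_adj_' would be picked up too. The sketch below is a stricter variant, not part of the original answer. It assumes the flag columns are always named has_adj_ followed by digits, and it uses a small hand-built frame in place of the pipeline output:

```python
import pandas as pd

# Hand-built stand-in for the pipeline output (assumed column names)
df = pd.DataFrame({
    'id': ['AX1', 'V0D', 'G7L'],
    'has_adj_0': [True, False, False],
    'has_adj_1': [False, False, True],
})

# Anchored regex: keep only columns named has_adj_<digits>
flag_cols = df.filter(regex=r'^has_adj_\d+$')

# any(axis=1) reduces across the columns, giving one boolean per row
df['has_adj'] = flag_cols.any(axis=1)

print(df)
```

Swapping any for all on the same selection would instead require every flag column to be True for the row.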
