Filter dataframe by value only if the value is not a specific value

Question

I want to filter a dataframe by values received in JSON like below:

region = request_json['region_value']
ma = request_json['ma_value']
market = request_json['market_value']
subsegment = request_json['subsegment_value']

filtered =df[
    (df['Region'] == region) &
    (df['MA/Segment'] == ma) &
    (df['Market'] == market) &
    (df['Subsegment'] == subsegment)
]

However, if region, ma, market or subsegment equal 'All', I don't want to include it in the filtering.

So for example if region = 'All', the code should be like this:

filtered =df[
        (df['MA/Segment'] == ma) &
        (df['Market'] == market) &
        (df['Subsegment'] == subsegment)
    ]

I would like to avoid multiple ifs and go for a more elegant solution instead. I would welcome any ideas.

cs95 · Accepted Answer

You can use np.logical_and.reduce; this will work even if your column names have spaces, special characters, etc:

conditions = {
    'Region': region, 'MA/Segment': ma, 'Market': market, 'Subsegment': subsegment}

mask = pd.np.logical_and.reduce([
    df[k] == v for k, v in conditions.items() if v != 'All'])
df[mask]

Another option is query, but this will work assuming your column names conform to valid python identifier names.

query = ' and '.join([
    f'{k}=={repr(v)}' for k, v in conditions.items() if v != 'All'])    
df.query(query)

Filter dataframe by value only if the value is not a specific value

Answers (2)

Related Questions