bwrabbit
bwrabbit

Reputation: 529

How to search for a row in a dataframe that has all elements expressed by a dictionary?

I am trying to find out how to check if there is any row within my dataframe which has every item reflected in a dictionary. The dictionary may have more items which are not expressed in the dataframe rows.

For example,

import pandas as pd
validCombinations = { 'combo1':{ 'prop1': 'X', 'prop2':'Y', 'prop3': 'Z' },
                      'combo2':{ 'prop1': 'U', 'prop3': 'Z' },
                      'combo3':{ 'prop1': 'X', 'prop2':'X' },
                      'combo4':{ 'prop1': 'X', 'prop2':'Y', 'prop3': 'U' }
                    }
df = pd.DataFrame.from_dict( { k: validCombinations[k]
                               for k in validCombinations.keys() },                              
                             orient='index')
print df
>>
        prop1 prop2 prop3
combo1     X     Y     Z
combo2     U   NaN     Z
combo3     X     X   NaN
combo4     X     Y     U

mydict = { 'prop1': 'X', 'prop2':'Y', 'prop3': 'Z', 'prop4': 'V' }

I now want a function that returns True or False depending on whether a complete row from df was expressed by mydict,

dfRowMatchedMyDict( df, mydict)
--> this should return True since combo1 in df has all items expressed in mydict 

I am really not sure if I should be converting mydict to a dataframe first, or how to go about solving this in a clean way?

Upvotes: 2

Views: 53

Answers (3)

cs95
cs95

Reputation: 402854

Option 1
df.isin + df.all -

df[df.isin(list(mydict.values())).all(1)]

       prop1 prop2 prop3
combo1     X     Y     Z

Option 2
Using piR's hack with assign, filter on valid columns only and test with eq -

df[df.eq(df.assign(**mydict)[df.columns]).all(1)]

       prop1 prop2 prop3
combo1     X     Y     Z

Upvotes: 2

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210912

Try this:

In [116]: qry = ' and '.join(['{} == "{}"'.format(k,v) for k,v in mydict.items() if k in df.columns.tolist()])

In [117]: df.query(qry)
Out[117]:
       prop1 prop2 prop3
combo1     X     Y     Z

or:

In [121]: df.eval(qry)
Out[121]:
combo1     True
combo2    False
combo3    False
combo4    False
dtype: bool

Details:

In [119]: qry
Out[119]: 'prop1 == "X" and prop2 == "Y" and prop3 == "Z"'

UPDATE:

if your dict would contain only valid columns we could do the following:

In [137]: mydict2 = { 'prop1': 'X', 'prop2':'Y', 'prop3': 'Z'}

In [138]: df.eq(pd.Series(mydict2), axis=1)
Out[138]:
        prop1  prop2  prop3
combo1   True   True   True
combo2  False  False   True
combo3   True  False  False
combo4  False   True  False

In [139]: df.eq(pd.Series(mydict2), axis=1).all(1)
Out[139]:
combo1     True
combo2    False
combo3    False
combo4    False
dtype: bool

or using original dictionary:

In [141]: flt = pd.Series(mydict).loc[df.columns]

In [142]: flt
Out[142]:
prop1    X
prop2    Y
prop3    Z
dtype: object

In [143]: df.eq(flt, axis=1).all(1)
Out[143]:
combo1     True
combo2    False
combo3    False
combo4    False
dtype: bool

Upvotes: 2

Javier
Javier

Reputation: 420

Another option:

def dfRowMatchedMyDict(df,mydict):
    cols = set(mydict) & set(df.columns)
    trues = np.zeros(len(df))
    for column in cols:
        trues += df[column] == mydict[column]

    return trues == len(cols)

Out

dfRowMatchedMyDict(df,mydict)

combo1     True
combo2    False
combo3    False
combo4    False
Name: prop1, dtype: bool

Upvotes: 0

Related Questions