Nobel

Reputation: 1555

Filter pandas dataframe with SequenceMatcher

When I filter the DataFrame using the code below, it works fine:

my_df.loc[lambda x:x["name"]=="space"]

When I filter using the following code, it raises an error:

my_df.loc[lambda x: difflib.SequenceMatcher(None,"email",x["name"]).ratio()>0.8]

I want to filter using SequenceMatcher, and possibly with more complex conditions than the one above.

Here is the full code:

import pandas as pd
import difflib
my_df=pd.DataFrame({"name":["space","mapp","eemail","daata"],"id":[9,12,13,14]})
my_df.loc[lambda x:x["name"]=="space"] #this line works
my_df.loc[lambda x: difflib.SequenceMatcher(None,"email",x["name"]).ratio()>0.8] #this doesn't

Upvotes: 1

Views: 1004

Answers (1)

gyx-hh

Reputation: 1431

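The callable passed to .loc receives the whole DataFrame, so x["name"] is a Series rather than a single string, and SequenceMatcher does not compare it element by element. Apply the comparison to each value of the column instead: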

my_df.loc[my_df['name'].apply(lambda x: difflib.SequenceMatcher(None,"email",x).ratio()) > 0.8]

Output:

    id  name
2   13  eemail
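
For the more complex conditions mentioned in the question, one possible approach is to compute the similarity scores as their own Series and combine boolean masks with `&`. The sketch below is only an illustration: the `similarity` helper and the extra `id > 10` condition are hypothetical and not part of the original question.

```python
import difflib

import pandas as pd


def similarity(a, b):
    """Return the SequenceMatcher ratio (0.0 to 1.0) between two strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()


my_df = pd.DataFrame(
    {"name": ["space", "mapp", "eemail", "daata"], "id": [9, 12, 13, 14]}
)

# One score per row: apply() passes each string in the column to the lambda.
scores = my_df["name"].apply(lambda name: similarity("email", name))

# Combine the fuzzy match with another (hypothetical) condition.
# Parentheses are needed because & binds tighter than comparisons.
mask = (scores > 0.8) & (my_df["id"] > 10)

result = my_df.loc[mask]
print(result)
```

Keeping `scores` as a separate Series also makes it easy to inspect the ratios before settling on a threshold.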

Upvotes: 2
