user2293224

Reputation: 2220

Python pandas: filtering one dataframe from another using contain and join statement

I have a data frame that looks like this:

df:

Noun    Thumb_count  
ability     19.0
account     3.0
accuracy    155.0
accurate    151.0
activity    163.0
adapt       3.0
app         15.0
gps         13.0

I have another data frame that looks like this:

df1:

Review Text                                         Noun          Thumbups  Rating  Review Date
This app is not working properly. GPS is showi...   app           34.0      2       August 3, 2020
This app is not working properly. GPS is showi...   gps           34.0      2       August 3, 2020
This app is not working properly. GPS is showi...   network       34.0      2       August 3, 2020
This app is not working properly. GPS is showi...   connectivity  34.0      2       August 3, 2020
This app is not working properly. GPS is showi...   signal        34.0      2       August 3, 2020

I want to keep only the rows of df1 whose Noun value also appears in the Noun column of df. Here is my filtering code:

df1[df1.Noun.str.contains(('|').join(df.Noun.values.tolist()))]

When I run the above command, it throws the following error:

error: nothing to repeat at position 2

I am not sure where I am making a mistake. Could anyone point out what I am doing wrong?

Upvotes: 0

Views: 75

Answers (1)

YOLO

Reputation: 21709

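The "nothing to repeat" error comes from the regex engine: str.contains treats the joined string as a regular expression, so a Noun value containing a metacharacter such as * or + can break the pattern. The parentheses around '|' are redundant but harmless, so the call can be written more simply as: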

df1[df1.Noun.str.contains('|'.join(df.Noun.tolist()))]
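If the error persists with the simplified call, a likely cause is a regex metacharacter inside one of the Noun values. Below is a minimal sketch, assuming that is the cause, which escapes each value with re.escape before joining. The sample frames are hypothetical stand-ins for df and df1, and the value "+1" is invented purely to show a metacharacter being handled.

```python
import re

import pandas as pd

# Hypothetical sample data; "+1" is an invented value that would break
# an unescaped pattern because "+" is a regex quantifier.
df = pd.DataFrame({"Noun": ["app", "gps", "+1"], "Thumb_count": [15.0, 13.0, 1.0]})
df1 = pd.DataFrame(
    {
        "Noun": ["app", "gps", "network", "connectivity", "signal"],
        "Thumbups": [34.0] * 5,
    }
)

# Escape every noun so that characters such as + or * are matched literally.
pattern = "|".join(re.escape(noun) for noun in df["Noun"])

# Substring match: keep rows whose Noun contains any of the escaped values.
filtered = df1[df1["Noun"].str.contains(pattern, regex=True)]
print(filtered)
```

Note that str.contains performs a substring match, so a value like "app" would also match a longer noun such as "apple". If exact matches are wanted, a non-regex comparison is usually the safer choice.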

Alternatively, since you want exact matches, you can use the isin method, which avoids regular expressions entirely:

df1[df1.Noun.isin(df.Noun)]
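For illustration, here is a small self-contained sketch of the isin approach. The frames are cut-down, hypothetical versions of df and df1 from the question, with only a few columns kept.

```python
import pandas as pd

# Cut-down, hypothetical versions of the frames from the question.
df = pd.DataFrame(
    {
        "Noun": ["ability", "account", "app", "gps"],
        "Thumb_count": [19.0, 3.0, 15.0, 13.0],
    }
)
df1 = pd.DataFrame(
    {
        "Noun": ["app", "gps", "network", "connectivity", "signal"],
        "Thumbups": [34.0] * 5,
        "Rating": [2] * 5,
    }
)

# Boolean mask: True where df1's Noun is exactly equal to some Noun in df.
mask = df1["Noun"].isin(df["Noun"])

# Keep only the matching rows.
result = df1[mask]
print(result)
```

Because isin compares whole values rather than building a pattern, special characters in the nouns need no escaping, and partial matches are not picked up.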

Upvotes: 2
