Reputation: 199
How can I filter a dataframe to the rows whose values are contained within the entries of a list? The values in the dataframe are only ever partial matches of the list entries, never exact matches.
I've tried using pandas.DataFrame.isin, but this only works if the values in the dataframe are exactly the same as those in the list.
list = ["123 MAIN STREET", "456 BLUE ROAD", "789 SKY DRIVE"]
df =
address
0 123 MAIN
1 456 BLUE
2 987 PANDA
target_df = df[df["address"].isin(list)]
Ideally the result should be
target_df =
address
0 123 MAIN
1 456 BLUE
Upvotes: 2
Views: 1231
Reputation: 59274
Use str.contains and a simple regex using | to connect the terms.
f = '|'.join
mask = f(map(f, map(str.split, list)))
df[df.address.str.contains(mask)]
address
0 123 MAIN
1 456 BLUE
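For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea, assuming the data from the question. The name `addresses` stands in for `list` (to avoid shadowing the built-in), and the `re.escape` call is my addition so that any regex metacharacters in the terms are matched literally.

```python
import re
import pandas as pd

# Data from the question; `addresses` replaces the name `list`.
addresses = ["123 MAIN STREET", "456 BLUE ROAD", "789 SKY DRIVE"]
df = pd.DataFrame({"address": ["123 MAIN", "456 BLUE", "987 PANDA"]})

# Split every full address into words and join the unique words with "|",
# giving a pattern that matches any row containing at least one of them.
words = sorted({word for addr in addresses for word in addr.split()})
pattern = "|".join(re.escape(word) for word in words)

target_df = df[df["address"].str.contains(pattern, regex=True)]
print(target_df)
```

Note that this matches on individual words, so a row such as "999 MAIN" would also be kept because it shares the word MAIN with one of the addresses.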
Upvotes: 2
Reputation: 323316
I ended up using a for loop, where l is the list from the question:
df[[any(x in y for y in l) for x in df.address]]
Out[257]:
address
0 123 MAIN
1 456 BLUE
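A runnable sketch of the loop approach, again assuming the data from the question; the names `addresses`, `mask`, and `loop_df` are placeholders of mine. Each row is kept only when its whole value is a substring of at least one full address, which is stricter than matching on single words.

```python
import pandas as pd

# Data from the question; `addresses` plays the role of `l` above.
addresses = ["123 MAIN STREET", "456 BLUE ROAD", "789 SKY DRIVE"]
df = pd.DataFrame({"address": ["123 MAIN", "456 BLUE", "987 PANDA"]})

# One boolean per row: True when the partial address appears
# inside at least one of the full addresses.
mask = [any(partial in full for full in addresses) for partial in df["address"]]

loop_df = df[mask]
print(loop_df)
```

This is a plain Python loop over every row and every address, so it may be slow on large frames, but it needs no regex escaping.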
Upvotes: 1