Reputation: 2031
How do I find all the substrings matching a df column value?
import pandas as pd
text = "The quick brown fox jumps over the lazy dog"
df = pd.DataFrame(['quick brown fox', 'jump', 'lazy dog', 'banana', 'quick fox'], columns=['value'])
results = get_matches(df, text)
# Expected results: ['quick brown fox', 'jump', 'lazy dog']
Upvotes: 0
Views: 48
Reputation: 927
matches = []
for value in df.value:
    # keep each column value that occurs as a substring of text
    if value in text:
        print(value)
        matches.append(value)
print(matches)
Upvotes: 0
Reputation: 61900
One option:
import pandas as pd
text = "The quick brown fox jumps over the lazy dog"
df = pd.DataFrame(['quick brown fox', 'jump', 'lazy dog', 'banana', 'quick fox'], columns=['value'])
def get_matches(df, text):
    return df[df['value'].apply(text.__contains__)]
res = get_matches(df, text)
print(res)
Output
             value
0  quick brown fox
1             jump
2         lazy dog
As an alternative, use str.find:
def get_matches(df, text):
    return df[df['value'].apply(text.find).ne(-1)]
res = get_matches(df, text)
print(res)
Output
             value
0  quick brown fox
1             jump
2         lazy dog
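A small sketch of why the str.find version compares against -1 instead of relying on truthiness: str.find returns the lowest index of the match, which is 0 (falsy) when the substring sits at the very start of the text, and -1 when there is no match. The variable names below are only for illustration.

```python
text = "The quick brown fox jumps over the lazy dog"

# str.find returns the index of the first occurrence, or -1 if absent
found = text.find("quick brown fox")
missing = text.find("banana")
at_start = text.find("The")

print(found, missing, at_start)

# A match at index 0 is falsy, so test against -1 explicitly
print(bool(at_start), at_start != -1)
```

This is why `.ne(-1)` is used on the result of `apply(text.find)`: a plain boolean conversion would drop a value that matches at position 0 and keep every non-match.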
Upvotes: 4
Reputation: 150735
Try:
def get_matches(df, text):
    return df.loc[[t in text for t in df['value']], 'value']
get_matches(df, text)
Output:
0    quick brown fox
1               jump
2           lazy dog
Name: value, dtype: object
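The question's expected result is a plain Python list rather than a Series. A minimal sketch of one way to get that, assuming the same df and text as in the question; get_matches_list is a hypothetical name used only here, and Series.tolist() does the conversion.

```python
import pandas as pd

text = "The quick brown fox jumps over the lazy dog"
df = pd.DataFrame(
    ['quick brown fox', 'jump', 'lazy dog', 'banana', 'quick fox'],
    columns=['value'],
)

def get_matches_list(df, text):
    # hypothetical helper: same boolean mask as above, returned as a list
    mask = [value in text for value in df['value']]
    return df.loc[mask, 'value'].tolist()

matches = get_matches_list(df, text)
print(matches)
```

Note that 'jump' is kept because it is a substring of 'jumps', while 'quick fox' is dropped because those two words are not adjacent in the text.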
Upvotes: 1