Pandas: Search and return data frame that includes specific words in a column

Question

I have a dataframe with roughly the following structure:

NDB_No  Shrt_Desc   Water_(g)   Energ_Kcal  Protein_(g) ...   
01001   BUTTER,WITH SALT    15,87   717 0,85  
01002   BUTTER,WHIPPED,W/ SALT  16,72   718 0,49  
...  
01004   CHEESE,BLUE 42,41   353 21,4    28,74  
01005   CHEESE,BRICK    41,11   371 23,24   29,68

I want to get a dataframe that includes only the rows where the Shrt_Desc column contains any of the items in the list to_be_found = [BUTTER, PASTA, ..etc], but not CHEESE.
The word to be found (from the list above) could be anywhere in Shrt_Desc, not necessarily at the beginning, like SALT above.

How should I approach this?
Thanks!

Paradigm · Accepted Answer

The following code resolves the issue (based on @piRSquared's hint above).

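import pandas as pd
from collections import Counter

# Load the food table; Shrt_Desc holds comma-separated descriptions.
food_info = pd.read_excel("ABBREV.xlsx")
dfi_1 = food_info

to_be_found = ['BUTTER', 'CHEESE', 'MILK', 'OIL', 'CORN', 'SALT', 'INF', 'PEPPER', 'PASTA', 'GLUTEN-FREE']
found = []
# Split each description into its comma-separated parts.
dfi_6 = dfi_1.Shrt_Desc.str.split(',')
# Series.items() replaces iteritems(), which was removed in pandas 2.0.
for _, parts in dfi_6.items():
    for x in to_be_found:
        # True only when x equals a whole part, e.g. 'BUTTER' in ['BUTTER', 'WITH SALT'].
        if x in parts:
            found.append(x)

# Every matched word, then the total number of matches.
print(found)
print(len(found))

# Tally how often each word was matched.
c = Counter(found)
print(c)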
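
The code above counts matches but does not build the filtered dataframe the question asks for. A minimal sketch of one way to do that with boolean masks from Series.str.contains follows. The small food_info frame, the to_exclude list, and the build_pattern helper are assumptions made for illustration; they are not part of the original answer.

```python
import re

import pandas as pd

# Illustrative stand-in for the spreadsheet in the question (made-up rows).
food_info = pd.DataFrame({
    "NDB_No": ["01001", "01002", "01004", "01005", "20001"],
    "Shrt_Desc": [
        "BUTTER,WITH SALT",
        "BUTTER,WHIPPED,W/ SALT",
        "CHEESE,BLUE",
        "CHEESE,BRICK",
        "PASTA,W/ CHEESE",
    ],
})

to_be_found = ["BUTTER", "PASTA"]
to_exclude = ["CHEESE"]  # hypothetical name for the words that must not appear


def build_pattern(words):
    # Hypothetical helper: join the words into one regex alternation,
    # escaping each so characters such as "-" or "/" match literally.
    return "|".join(re.escape(w) for w in words)


desc = food_info["Shrt_Desc"]
# na=False treats missing descriptions as "no match".
has_wanted = desc.str.contains(build_pattern(to_be_found), na=False)
has_excluded = desc.str.contains(build_pattern(to_exclude), na=False)

# Keep rows that contain a wanted word and none of the excluded ones.
result = food_info[has_wanted & ~has_excluded]
print(result)
```

Note that this matches substrings anywhere in the description, so a word like OIL would also match BOILED. If whole-word matches are needed, the pattern would have to be tightened, for example with word boundaries.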
