Pandas: Select rows that contain any substring from a list

Question

I would like to select those rows in a column that contains any of the substrings in a list. This is what I have for now.

product = ['LID', 'TABLEWARE', 'CUP', 'COVER', 'CONTAINER', 'PACKAGING']

df_plastic_prod = df_plastic[df_plastic['Goods Shipped'].str.contains(product)]

df_plastic_prod.info()

Sample df_plastic

Name          Product
David        PLASTIC BOTTLE
Meghan       PLASTIC COVER
Melanie      PLASTIC CUP 
Aaron        PLASTIC BOWL
Venus        PLASTIC KNIFE
Abigail      PLASTIC CONTAINER
Sophia       PLASTIC LID

Desired df_plastic_prod

Name          Product
Meghan       PLASTIC COVER
Melanie      PLASTIC CUP 
Abigail      PLASTIC CONTAINER
Sophia       PLASTIC LID

Thanks in advance! I appreciate any assistance on this!

jezrael · Accepted Answer

For match values by subtrings join all values of list by | for regex or - so get values LID or TABLEWARE ...:

Solution working well also with 2 or more words in list.

pat = '|'.join(r"\b{}\b".format(x) for x in product)
df_plastic_prod = df_plastic[df_plastic['Product'].str.contains(pat)]
print (df_plastic_prod)
      Name            Product
1   Meghan      PLASTIC COVER
2  Melanie        PLASTIC CUP
5  Abigail  PLASTIC CONTAINER
6   Sophia        PLASTIC LID

Pandas: Select rows that contain any substring from a list

Answers (2)

Related Questions