How to check entries in a column for patterns and calculate the number of patterns?

Question

I have a DataFrame:

         Name         Price
0        Dictionary     3
1        Book           4
2        Dict En-Ru     2
3        BookforKids    6
4        Dict FR-CHN    1

I need a piece of code that will check the column 'Name' for patterns that I can specify myself and will count the number of founded patterns in another DataFrame.

For instance, check the number of entries in the 'Name' column with the patterns Dict an Book ignoring the case should give this result:

|  Pattern    | Occurencies |
| ----------- | ----------- |
| Dict        | 3           |
| Book        | 2           |

yatu · Accepted Answer

Here's one way using str.extract:

patterns = ['Dict','Book']
df.Name.str.extract(rf"({'|'.join(patterns)})", expand=False).value_counts()

Dict    3
Book    2
Name: 0, dtype: int64

You can make it case insensitive with the flags argument:

patterns_lower = '|'.join([s.lower() for s in patterns])
(df.Name.str.lower().str.extract(rf"({patterns_lower})", expand=False)
        .value_counts())

How to check entries in a column for patterns and calculate the number of patterns?

Answers (2)

Related Questions