Reputation: 571
I have a DataFrame:
Name Price
0 Dictionary 3
1 Book 4
2 Dict En-Ru 2
3 BookforKids 6
4 Dict FR-CHN 1
I need a piece of code that will check the column 'Name' for patterns that I can specify myself and will count the number of founded patterns in another DataFrame.
For instance, check the number of entries in the 'Name' column with the patterns Dict an Book ignoring the case should give this result:
| Pattern | Occurencies |
| ----------- | ----------- |
| Dict | 3 |
| Book | 2 |
Upvotes: 0
Views: 50
Reputation: 88275
Here's one way using str.extract
:
patterns = ['Dict','Book']
df.Name.str.extract(rf"({'|'.join(patterns)})", expand=False).value_counts()
Dict 3
Book 2
Name: 0, dtype: int64
You can make it case insensitive with the flags
argument:
patterns_lower = '|'.join([s.lower() for s in patterns])
(df.Name.str.lower().str.extract(rf"({patterns_lower})", expand=False)
.value_counts())
Upvotes: 2
Reputation: 4618
You can define your pattern as a custom function:
# example
def get_pattern(txt):
if 'Dict' in txt:
return 'Dict'
if 'Book' in txt:
return 'Book'
return np.nan
Then you apply in your dataframe and use value counts:
df['Pattern'] = df['Name'].apply(get_pattern)
df['Pattern'].value_counts()
Dict 3
Book 2
dtype: int64
Upvotes: 0