brooks08
brooks08

Reputation: 39

Counting rows in df against string

I have a df with a feature that contains some pattern of +/- for the prior N days. For each row in the df, I'm trying to count the number of times that given pattern appears in a string (and add this count as a new column in df)

For example

d = {'day': [1, 2, 3], 'pattern': ['++-', '+++', '-+-']}
df = pd.DataFrame(data=d)
s = ('++-+++----++-+-')

For day 1 (row 1) I want it to search s for '++-' and return 3. For Day 2 it should return 1, etc.

In Excel this would be an easy countifs so I've been trying to use groupby().count(), or .str.contains but all of the examples I find have hardcoded strings to search for rather than iterating through the df to search for the pattern in that row.

Any help is appreciated!

Upvotes: 0

Views: 54

Answers (2)

Henry Ecker
Henry Ecker

Reputation: 35626

Something like this should work.

import pandas as pd

d = {'day': [1, 2, 3], 'pattern': ['++-', '+++', '-+-']}
df = pd.DataFrame(data=d)
s = '++-+++----++-+-'

df['count'] = df['pattern'].apply(s.count)

print(df)

You can apply str count to each pattern in the DataFrame then assign it to your new column. Here I used count as the new column name.

Upvotes: 1

ThePyGuy
ThePyGuy

Reputation: 18406

You can use str.count along with pandas' apply

>>df['pattern'].apply(s.count)
0    3
1    1
2    1
Name: pattern, dtype: int64

Upvotes: 3

Related Questions