Counting rows in df against string

Question

I have a df with a feature that contains some pattern of +/- for the prior N days. For each row in the df, I'm trying to count the number of times that given pattern appears in a string (and add this count as a new column in df)

For example

d = {'day': [1, 2, 3], 'pattern': ['++-', '+++', '-+-']}
df = pd.DataFrame(data=d)
s = ('++-+++----++-+-')

For day 1 (row 1) I want it to search s for '++-' and return 3. For Day 2 it should return 1, etc.

In Excel this would be an easy countifs so I've been trying to use groupby().count(), or .str.contains but all of the examples I find have hardcoded strings to search for rather than iterating through the df to search for the pattern in that row.

Any help is appreciated!

Henry Ecker · Accepted Answer

Something like this should work.

import pandas as pd

d = {'day': [1, 2, 3], 'pattern': ['++-', '+++', '-+-']}
df = pd.DataFrame(data=d)
s = '++-+++----++-+-'

df['count'] = df['pattern'].apply(s.count)

print(df)

You can apply str count to each pattern in the DataFrame then assign it to your new column. Here I used count as the new column name.

Counting rows in df against string

Answers (2)

Related Questions