Javier Cordero

Reputation: 13

Use the apply function in Pandas to use a Regex count per row

I have a Pandas df that has this structure:

Store  CID    UnitsOH                                             Count
1      23095  17_17_17_16_16_15_15_15_15_15_13_12_10_9_8_7_7...  15982
       23101  6_6_5_5_5_5_4_3_3_3_7_6_5_5_5_5_5_5_3_2_2_5_5_...  15982
       23117  6_6_6_6_6_6_6_6_6_6_6_6_5_5_5_4_3_3_3_3_3_3_3_...  15982
       23161  6_6_6_6_6_6_6_6_6_6_6_5_5_5_4_4_4_4_4_3_3_3_3_...  15982
       23222  5_5_5_5_5_5_5_5_4_4_4_4_3_3_3_3_3_3_3_3_3_3_7_...  15982

I need to count how many times a specific pattern occurs in the "Units OH" column, separately for each row. For example, I need to count how many times each row has a positive number followed by 0. I used "_" as the separator when I concatenated the field, so I'm looking for a pattern like '_[1-9]_[0]_'. (Sorry about the format; this is my first post here and I don't understand how to format the text correctly.)

I used this code to create that last column called 'Count':


ConcatOH['Count'] = ConcatOH['Units_OH'].str.count('_[1-9]_[0]_').sum()

However, as you can see, the count seems to run over the entire dataframe and gives me the same value for every row. How can I do the count by row only? Is there an axis=0 argument I could use somewhere, or can somebody show me how to use the apply method for this?

Upvotes: 1

Views: 192

Answers (2)

oppressionslayer

Reputation: 7224

Javier, do you mean something like this:

import re
# Count non-overlapping matches in each row's string; [1-9]\d* matches any positive integer
ConcatOH['Units_OH'].apply(lambda x: len(re.findall(r'_[1-9]\d*_0', x)))
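
A minimal sketch of this row-wise approach, run on made-up data rather than the asker's real frame. The sample strings and the pattern r'_[1-9]\d*_0' are assumptions about what "any positive number followed by 0" means; note that a number at the very start of a string has no leading "_" and so is not matched by this pattern.

```python
import re
import pandas as pd

# Hypothetical sample data standing in for the asker's ConcatOH frame.
ConcatOH = pd.DataFrame(
    {"Units_OH": ["3_2_1_0_0_4_0_1", "6_6_5_5", "0_0_12_0"]}
)

# Assumed pattern: underscore, a positive integer, underscore, then a 0.
pattern = re.compile(r"_[1-9]\d*_0")

# apply() calls the lambda once per row, so each row gets its own count.
ConcatOH["Count"] = ConcatOH["Units_OH"].apply(lambda s: len(pattern.findall(s)))

print(ConcatOH["Count"].tolist())  # [2, 0, 1]
```

The same per-row result should also come from ConcatOH["Units_OH"].str.count(r"_[1-9]\d*_0"), which avoids the explicit lambda.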

Upvotes: 0

Kenan

Reputation: 14104

Remove the .sum() at the end of ConcatOH['Units_OH'].str.count('_[1-9]_[0]_').sum()

ConcatOH['Units_OH'].str.count('_[1-9]_[0]_') returns a Series with one count per row. Calling .sum() collapses that Series into a single int, and assigning that int to ConcatOH['Count'] broadcasts it to every row, which is why you see the same value everywhere.

You're basically doing

ConcatOH['Count'] = 15982

You want

ConcatOH['Count'] = ConcatOH['Units_OH'].str.count('_[1-9]_[0]_')
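
To make the difference concrete, here is a small sketch on made-up data (the sample strings are hypothetical, not the asker's). It also illustrates one caveat that may or may not matter for the real data: because the pattern consumes the trailing "_", two hits that share an underscore are counted only once. The lookahead variant shown is one possible way around that, offered as an assumption about the intended semantics.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's ConcatOH frame.
ConcatOH = pd.DataFrame(
    {"Units_OH": ["4_3_0_2_0_1", "6_6_5_5", "1_2_0_0_7"]}
)

# Without .sum(): a Series with one count per row.
per_row = ConcatOH["Units_OH"].str.count("_[1-9]_[0]_")

# With .sum(): a single number, which would be broadcast to every row on assignment.
total = per_row.sum()

# Variant using a lookahead so the trailing "_" is checked but not consumed,
# letting back-to-back hits such as "_3_0_2_0_" both be counted.
per_row_lookahead = ConcatOH["Units_OH"].str.count(r"_[1-9]_0(?=_)")

ConcatOH["Count"] = per_row

print(per_row.tolist())            # [1, 0, 1]
print(total)                       # 2
print(per_row_lookahead.tolist())  # [2, 0, 1]
```

In the first row, "_3_0_" and "_2_0_" share an underscore, so the original pattern finds one match while the lookahead version finds two.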

Upvotes: 1
