Andy Winhold

Reputation: 3

Add number to column each time a different column has group of True bools

I have two columns I am working with. The first column is populated with zeros and the second column is populated with booleans.

column 1           column 2
0                  True
0                  True
0                  False
0                  True
0                  True
0                  False
0                  False
0                  True

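There are millions of rows, so I am trying to figure out an efficient process that looks at column 2 and numbers each consecutive group of True values: the first group gets 1 in column 1, the second gets 2, and so on, while rows where column 2 is False stay at 0.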

column 1           column 2
1                  True
1                  True
0                  False
2                  True
2                  True
0                  False
0                  False
3                  True

Any help is much appreciated!

Upvotes: 0

Views: 75

Answers (2)

piRSquared

Reputation: 294218

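# Mark the first row of each run: True here, and not True in the row above
df['column 3'] = df['column 2'] & (df['column 2'].shift() != True)
# A running count of run starts gives each run its number
df['column 4'] = df['column 3'].cumsum()

# Multiplying by the boolean zeroes out the False rows
df['column 1'] = df['column 2'] * df['column 4']

print(df)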

   column 1 column 2 column 3  column 4
0         1     True     True         1
1         1     True    False         1
2         0    False    False         1
3         2     True     True         2
4         2     True    False         2
5         0    False    False         2
6         0    False    False         2
7         3     True     True         3

Upvotes: 0

DSM

Reputation: 353009

One trick which often comes in handy when vectorizing operations on contiguous groups is the shift-cumsum pattern:

>>> c = df["column 2"]
>>> c * (c & (c != c.shift())).cumsum()
0    1
1    1
2    0
3    2
4    2
5    0
6    0
7    3
Name: column 2, dtype: int32
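
To see why the pattern works, it can help to split the one-liner into named steps. The following is a minimal sketch, not a definitive implementation: the sample frame is rebuilt from the question, the names `starts` and `run_id` are made up for illustration, and it uses `shift(fill_value=False)` with `~` as a variant of the `c != c.shift()` comparison above so that everything stays boolean.

```python
import pandas as pd

# Sample frame rebuilt from the question (column names as in the post).
df = pd.DataFrame({
    "column 1": 0,
    "column 2": [True, True, False, True, True, False, False, True],
})

c = df["column 2"]

# A run starts on a row that is True while the row above is not.
# fill_value=False treats the (missing) row above row 0 as False.
starts = c & ~c.shift(fill_value=False)

# The cumulative sum of the start flags is a running run number.
run_id = starts.cumsum()

# Keep the run number on True rows and write 0 everywhere else.
df["column 1"] = run_id.where(c, 0)

print(df)
```

Each step is a single vectorized pass over the column, so this should scale to millions of rows without a Python-level loop.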

Upvotes: 3
