Deepak Kumar

Reputation: 59

Create a new numeric variable using pandas?

I have a DataFrame df, given below.

df

VAR FLAG
A      1
A      1
A      1
B      1
B      1
B      0
B      0
B      1
B      1
B      1
B      0
C      0
C      0

I want to create a new variable named FLAG1 from VAR and FLAG. The expected output is given below.

Final output:

VAR FLAG    FLAG1
A      1        1
A      1        1
A      1        1
B      1        1
B      1        1
B      0        2
B      0        3
B      1        4
B      1        4
B      1        4
B      0        5
C      0        1
C      0        2

Upvotes: 1

Views: 35

Answers (2)

Scott Boston

Reputation: 153510

You can use this bit of logic:

df['FLAG1'] = (df.groupby('VAR')['FLAG']
                 .transform(lambda x: ((x != x.shift()) | (x == 0)).cumsum()))

Output:

   VAR  FLAG  FLAG1
0    A     1      1
1    A     1      1
2    A     1      1
3    B     1      1
4    B     1      1
5    B     0      2
6    B     0      3
7    B     1      4
8    B     1      4
9    B     1      4
10   B     0      5
11   C     0      1
12   C     0      2
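
To see why this works, the lambda can be split into named steps. This is only a sketch: the DataFrame is rebuilt by hand from the sample data in the question, and the helper name `run_counter` is made up for illustration.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption about
# how the original df was constructed).
df = pd.DataFrame({
    "VAR": list("AAABBBBBBBBCC"),
    "FLAG": [1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
})

def run_counter(flags: pd.Series) -> pd.Series:
    # True where the value differs from the previous row of the group.
    # The first row is compared with NaN, so it is always True.
    changed = flags != flags.shift()
    # True on every 0, so consecutive zeros each get their own number.
    is_zero = flags == 0
    # Every True opens a new block; the running sum numbers the blocks.
    return (changed | is_zero).cumsum()

# Apply the counter separately within each VAR group.
df["FLAG1"] = df.groupby("VAR")["FLAG"].transform(run_counter)
print(df)
```

Consecutive 1s share a number because only the first of them differs from the row before it, while each 0 always sets the mask and so bumps the counter.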

Upvotes: 3

BENY

Reputation: 323356

Try something new: factorize

pd.Series(zip(df['FLAG'].eq(0).cumsum(),df.FLAG)).groupby(df['VAR']).transform(lambda x : x.factorize()[0]+1)
Out[72]: 
0     1
1     1
2     1
3     1
4     1
5     2
6     3
7     4
8     4
9     4
10    5
11    1
12    2
dtype: int32
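
The one-liner can also be unpacked step by step. This is a rough sketch, assuming the DataFrame is rebuilt from the question's sample data; the names `zero_count` and `keys` are illustrative, and `list(zip(...))` is used here instead of a bare `zip` purely as a conservative choice.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (an assumption).
df = pd.DataFrame({
    "VAR": list("AAABBBBBBBBCC"),
    "FLAG": [1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
})

# Running count of zeros seen so far, over the whole frame.
zero_count = df["FLAG"].eq(0).cumsum()

# Pair the zero count with FLAG: every 0 gets a unique key, and a run
# of 1s between two zeros shares one key.
keys = pd.Series(list(zip(zero_count, df["FLAG"])), index=df.index)

# Within each VAR, factorize numbers the distinct keys 0, 1, 2, ...
# in order of first appearance; add 1 to start counting from 1.
flag1 = keys.groupby(df["VAR"]).transform(lambda x: x.factorize()[0] + 1)
df["FLAG1"] = flag1.astype(int)
print(df)
```

For 0/1 data like this, the result matches the shift/cumsum approach, since rows with equal keys are always adjacent.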

Upvotes: 1
