coding_monkey

Reputation: 399

Create a column whose value increments on each change in another column

I have a dataframe with two columns as below:

Var1 Var2
a   28
b   28
d   28
f   29
f   29
e   30
b   30
m   30
l   30
u   31
t   31
t   31

I'd like to create a third column whose value increases by one every time the value in the other column (Var2) changes.

Var1 Var2 Var3
a   28  1
b   28  1
d   28  1
f   29  2
f   29  2
e   30  3
b   30  3
m   30  3
l   30  3
u   31  4
t   31  4
t   31  4

How would I go about doing this?

Upvotes: 11

Views: 8805

Answers (3)

BENY

Reputation: 323286

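Using category codes (these number the distinct values in sorted order, so the result matches the desired output only when Var2 is sorted):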

df.Var2.astype('category').cat.codes.add(1)
Out[525]: 
0     1
1     1
2     1
3     2
4     2
5     3
6     3
7     3
8     3
9     4
10    4
11    4
dtype: int8
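
A small sketch of where the category approach and a run counter diverge. The Series below is hypothetical data (not from the question), chosen so that 28 reappears after 30:

```python
import pandas as pd

# Hypothetical data: Var2 is not sorted, and 28 comes back after 30.
s = pd.Series([28, 28, 30, 30, 28, 29])

# Category codes rank the distinct values (28 -> 0, 29 -> 1, 30 -> 2).
by_value = s.astype('category').cat.codes.add(1)

# A run counter instead increments at every change between neighbours.
by_run = s.ne(s.shift()).cumsum()

print(by_value.tolist())  # [1, 1, 3, 3, 1, 2]
print(by_run.tolist())    # [1, 1, 2, 2, 3, 4]
```

If the column is sorted, as in the question, both give the same numbering.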

Updated

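from itertools import groupby
import numpy as np
# Collect each run of consecutive equal values in Var2
grouped = [list(g) for k, g in groupby(df.Var2.tolist())]
# Repeat each run's 1-based index once per element of the run
np.repeat(range(len(grouped)), [len(x) for x in grouped]) + 1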
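
For reference, a self-contained version of the groupby idea, assuming the dataframe from the question is rebuilt by hand and the result is stored in a new Var3 column (the name run_lengths is just illustrative):

```python
from itertools import groupby

import numpy as np
import pandas as pd

# Data from the question.
df = pd.DataFrame({
    'Var1': list('abdffebmlutt'),
    'Var2': [28, 28, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31],
})

# groupby yields one (key, run) pair per block of equal neighbours.
run_lengths = [len(list(g)) for _, g in groupby(df['Var2'].tolist())]

# Repeat each run's 1-based index as many times as the run is long.
df['Var3'] = np.repeat(np.arange(1, len(run_lengths) + 1), run_lengths)

print(df['Var3'].tolist())  # [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4]
```

Because itertools.groupby only groups adjacent equal items, this counts runs and does not need Var2 to be sorted.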

Upvotes: 11

cs95

Reputation: 402563

You can compare Var2 with its shifted-by-1 version:

v
   Var1  Var2
a     0    28
b     1    28
d     2    28
f     3    30
f     4    30
e     5     2
b     6     2
m     7     2
l     8     2
u     9     5
t    10     5
t    11     5

i = v.Var2
# True where Var2 differs from the previous row; cumsum numbers the runs
v['Var3'] = i.ne(i.shift()).cumsum()

v
   Var1  Var2  Var3
a     0    28     1
b     1    28     1
d     2    28     1
f     3    30     2
f     4    30     2
e     5     2     3
b     6     2     3
m     7     2     3
l     8     2     3
u     9     5     4
t    10     5     4
t    11     5     4
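
To see why this works, here is the same expression split into its two steps, as a rough sketch using only the Var2 values from the example above (the names changed and var3 are just for illustration):

```python
import pandas as pd

# Var2 values from the example above.
var2 = pd.Series([28, 28, 28, 30, 30, 2, 2, 2, 2, 5, 5, 5])

# True wherever a value differs from the one before it. The first row is
# compared with NaN, so it always counts as a change.
changed = var2.ne(var2.shift())

# Each True adds 1 to the running total, giving a 1-based run number.
var3 = changed.cumsum()

print(changed.tolist()[:4])  # [True, False, False, True]
print(var3.tolist())         # [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4]
```

Since only neighbouring rows are compared, the values do not have to be sorted or even numeric.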

Upvotes: 13

John Zwinck

Reputation: 249253

Something like this, assuming Var2 is numeric (diff() subtracts consecutive values):

(df.Var2.diff() != 0).cumsum()
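
A runnable sketch of the same idea on the question's Var2 values, assigning the result to a new Var3 column (the intermediate name steps is only for illustration):

```python
import pandas as pd

# Var2 values from the question.
df = pd.DataFrame({'Var2': [28, 28, 28, 29, 29, 30, 30, 30, 30, 31, 31, 31]})

# Difference to the previous row; the first row has no predecessor, so NaN.
steps = df['Var2'].diff()

# NaN != 0 evaluates to True, so the first row starts the count at 1.
df['Var3'] = (steps != 0).cumsum()

print(df['Var3'].tolist())  # [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4]
```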

Upvotes: 7
