whitefang1993

Reputation: 1716

How to add extra column with values based on previous rows in Pandas data frame?

I have this data frame:

'C1'|'C2'
 0  | x
 1  | x1
 1  | x2 
 2  | x3
 0  | y
 1  | y1
 2  | y2
 0  | z
 1  | z1

I need to create an extra column like this:

'C1'|'C2'|'C3'
 0  | x  | x
 1  | x1 | x
 1  | x2 | x
 2  | x3 | x
 0  | y  | y
 1  | y1 | y
 2  | y2 | y 
 0  | z  | z
 1  | z1 | z

Basically, whenever I find 0 in the C1 column, I have to put the corresponding C2 value into all the following rows (until the next 0).

I am new to Pandas and I read that I should avoid manipulating the data frame using iterations.

How can I get this result without iterating? Is it possible?

Upvotes: 1

Views: 61

Answers (2)

prosti

Reputation: 46341

You may also try this:

df['C3'] = df['C2'].astype(str).str[0]
print(df)
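
One caveat worth checking: taking the first character matches the expected output for the sample data because each group label (x, y, z) is a single character. The sketch below uses hypothetical data (a two-character label "ab", not from the question) to compare that approach with one that keys on C1 == 0; the names first_char and by_marker are illustrative only.

```python
import pandas as pd

# Hypothetical data: the group label "ab" is two characters long.
df = pd.DataFrame({
    "C1": [0, 1, 0, 1],
    "C2": ["ab", "ab1", "c", "c1"],
})

# First-character approach: keeps only the first letter of each C2 value.
first_char = df["C2"].astype(str).str[0]

# Condition-based approach: takes the C2 value found at each C1 == 0 row.
by_marker = df["C2"].where(df["C1"].eq(0)).ffill()

print(first_char.tolist())
print(by_marker.tolist())
```

If the labels in the real data are always one character long, both approaches should agree.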

Upvotes: 2

jezrael

Reputation: 862611

Use Series.where to keep the C2 values only where the condition, tested with Series.eq (==), matches and to set missing values everywhere else; then forward-fill those missing values with ffill:

df['C3'] = df['C2'].where(df['C1'].eq(0)).ffill()
print(df)
   C1  C2 C3
0   0   x  x
1   1  x1  x
2   1  x2  x
3   2  x3  x
4   0   y  y
5   1  y1  y
6   2  y2  y
7   0   z  z
8   1  z1  z
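
To see the two steps separately, here is a minimal self-contained sketch. The frame is rebuilt from the question's sample data, and the intermediate name masked is only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "C1": [0, 1, 1, 2, 0, 1, 2, 0, 1],
    "C2": ["x", "x1", "x2", "x3", "y", "y1", "y2", "z", "z1"],
})

# Step 1: keep C2 only where C1 == 0; every other row becomes NaN.
masked = df["C2"].where(df["C1"].eq(0))

# Step 2: forward-fill so each NaN takes the last value seen at a 0 row.
df["C3"] = masked.ffill()

print(df)
```

Note that if the first row does not have C1 equal to 0, the rows before the first 0 stay NaN, because there is no earlier value to fill from.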

Upvotes: 3
