Christophe

Reputation: 2012

Create a column in a pandas DataFrame using the previously computed value

For example, I have the following input DataFrame:

> df = pandas.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
> df
    x
0   1
1   6
2   8
3   5
4   2
5   6
6  12

And I would like to create the column y such that:

y[i] = 0 if x[i] < 4,

y[i] = 1 if x[i] > 6,

and y[i] = y[i - 1] if 4 <= x[i] <= 6.

With the example above, the output would be:

    x  y
0   1  0
1   6  0
2   8  1
3   5  1
4   2  0
5   6  0
6  12  1

What is the best way to do this? A simple apply() does not seem to work, because I could not find a way to reference a previously computed value in the column that apply() is creating.

Upvotes: 1

Views: 44

Answers (1)

behzad.nouri

Reputation: 77971

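You can use np.select to set the rows whose value is known (leaving NaN where 4 <= x <= 6), then forward-fill the NaNs from the previous row:

>>> import numpy as np
>>> df['y'] = np.select([df['x'] < 4, df['x'] > 6], [0, 1], np.nan)
>>> df['y'] = df['y'].ffill().astype('int')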
>>> df
    x  y
0   1  0
1   6  0
2   8  1
3   5  1
4   2  0
5   6  0
6  12  1
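
For reference, here is the same idea as a self-contained sketch. The function name hysteresis_column and its parameters low, high and initial are made up for illustration; in particular, initial is an assumption about what y should be when the very first x already falls in the 4..6 band, a case the question does not specify (without it, the leading NaN would make astype fail).

```python
import numpy as np
import pandas as pd


def hysteresis_column(x, low=4, high=6, initial=0):
    """Return 0 where x < low, 1 where x > high, else the previous value.

    `initial` is an assumption (not part of the question): it is used when
    the series starts inside the [low, high] band and has no previous value.
    """
    # NaN marks the rows that must inherit the previous row's value.
    known = np.select([x < low, x > high], [0.0, 1.0], default=np.nan)
    y = pd.Series(known, index=x.index)
    # Forward-fill propagates the last known value; leading NaNs get `initial`.
    return y.ffill().fillna(initial).astype(int)


df = pd.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
df['y'] = hysteresis_column(df['x'])
print(df)
```

This avoids a Python-level loop, since the forward-fill is what carries y[i - 1] into the rows where 4 <= x[i] <= 6.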

Upvotes: 1
