Reputation: 2012
For example, I have the following input DataFrame:
> df = pandas.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
> df
x
0 1
1 6
2 8
3 5
4 2
5 6
6 12
And I would like to create the column y such that:
y[i] = 0 if x[i] < 4,
y[i] = 1 if x[i] > 6,
and y[i] = y[i - 1] if 4 <= x[i] <= 6.
So that with the example above the output would be:
x y
0 1 0
1 6 0
2 8 1
3 5 1
4 2 0
5 6 0
6 12 1
What is the best way to do this? A simple apply() does not seem to work, as I did not find a way to reference a previously computed value in the column that is being created by the apply().
Upvotes: 1
Views: 44
Reputation: 77971
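You may use np.select to fill in the rows with a definite answer, followed by a forward fill (.ffill) to carry the previous value into the remaining rows:
>>> import numpy as np
>>> df['y'] = np.select([df['x'] < 4, 6 < df['x']], [0, 1], np.nan)
>>> df['y'] = df['y'].ffill().astype('int')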
>>> df
x y
0 1 0
1 6 0
2 8 1
3 5 1
4 2 0
5 6 0
6 12 1
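One case the example does not cover is a first row that already falls in the 4 <= x <= 6 band: there is no previous value to carry forward, so the forward fill leaves a NaN and the cast to int fails. Below is a minimal sketch of one way to handle that, assuming a starting value of 0 is acceptable. The function name hysteresis_column and its low, high and initial parameters are made up for illustration and are not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def hysteresis_column(x: pd.Series, low: float = 4, high: float = 6,
                      initial: int = 0) -> pd.Series:
    """Return 0 where x < low, 1 where x > high, else the previous result.

    `initial` (an assumption, not from the question) is used for leading
    rows that fall inside [low, high] and so have no previous value.
    """
    # Rows with a definite answer get 0 or 1; the rest stay NaN for now.
    decided = np.select([x < low, x > high], [0.0, 1.0], default=np.nan)
    y = pd.Series(decided, index=x.index)
    # Carry the last decided value forward, then fill any leading gaps.
    return y.ffill().fillna(initial).astype(int)


df = pd.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
df['y'] = hysteresis_column(df['x'])
print(df['y'].tolist())  # [0, 0, 1, 1, 0, 0, 1]
```

With the question's data this gives the same column as above. For an input such as [5, 5, 8, 5], the two leading rows take the assumed initial value, so the result is [0, 0, 1, 1].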
Upvotes: 1