Reputation: 17631
I have the following pandas dataframe:
import pandas as pd
data = {"first_name": ["Alexander", "Alan", "Heather", "Marion", "Amy", "John"],
        "last_name": ["Miller", "Jacobson", ".", "Milner", "Cooze", "Smith"],
        "age": [42, 52, 36, 24, 73, 19],
        "marriage_status": [0, 0, 1, 1, 0, 1]}
df = pd.DataFrame(data)
df
age first_name last_name marriage_status
0 42 Alexander Miller 0
1 52 Alan Jacobson 0
2 36 Heather . 1
3 24 Marion Milner 1
4 73 Amy Cooze 0
5 19 John Smith 1
....
The column marriage_status is a column of binary data, 0 and 1. Before each 1, I would like to make the preceding row a 1 as well. In this example, the dataframe would become:
age first_name last_name marriage_status
0 42 Alexander Miller 0
1 52 Alan Jacobson 1 # this changed to 1
2 36 Heather . 1
3 24 Marion Milner 1
4 73 Amy Cooze 1 # this changed to 1
5 19 John Smith 1
....
In other words, there are "groups" of consecutive ones in this column, and I would like to set the row immediately preceding each group to 1 instead of 0. How can I do this?
My thought was to somehow write a for loop, but that isn't a pandas-based solution. One could also try enumerate(), but then I need to set the preceding value to 1; without addition, I'm not sure how this would work.
Upvotes: 2
Views: 313
Reputation: 294338
We can use the or operator |. It will treat the 1s as True and the 0s as False. | will evaluate to False only when we have a 0 in a row and a 0 in the next row.
df.marriage_status = (
    df.marriage_status | df.marriage_status.shift(-1)
).astype(int)
df
age first_name last_name marriage_status
0 42 Alexander Miller 0
1 52 Alan Jacobson 1
2 36 Heather . 1
3 24 Marion Milner 1
4 73 Amy Cooze 1
5 19 John Smith 1
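A hedged aside on the snippet above: shift(-1) leaves a NaN in the last row, so the shifted column is float rather than integer, and depending on the pandas version, combining an integer column with a float column via | may raise a TypeError. A minimal sketch that sidesteps this, assuming a pandas version where shift accepts fill_value (0.24 or later) and using a cut-down frame with only the marriage_status column:

```python
import pandas as pd

# Cut-down version of the frame from the question (only the relevant column)
df = pd.DataFrame({"marriage_status": [0, 0, 1, 1, 0, 1]})

# fill_value=0 keeps the shifted column as integers (no trailing NaN)
next_row = df["marriage_status"].shift(-1, fill_value=0)

# OR of two integer 0/1 columns: 1 if this row or the next row is 1
df["marriage_status"] = df["marriage_status"] | next_row

print(df["marriage_status"].tolist())  # [0, 1, 1, 1, 1, 1]
```

Because both operands stay integer, the result is already 0/1 and no astype(int) is needed in this variant.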
Upvotes: 4
Reputation: 210852
You can use the Series.shift(-1) method:
In [21]: df.loc[df.marriage_status.shift(-1) == 1, 'marriage_status'] = 1
In [22]: df
Out[22]:
age first_name last_name marriage_status
0 42 Alexander Miller 0
1 52 Alan Jacobson 1
2 36 Heather . 1
3 24 Marion Milner 1
4 73 Amy Cooze 1
5 19 John Smith 1
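To see why this works, it may help to look at the intermediate values. The sketch below uses a standalone Series (the name status is only for illustration) with the same values as the question's column and prints the shifted values and the resulting boolean mask:

```python
import pandas as pd

status = pd.Series([0, 0, 1, 1, 0, 1], name="marriage_status")

# shift(-1) moves every value up one row; the last row has no
# successor, so it becomes NaN (and the dtype becomes float)
nxt = status.shift(-1)
print(nxt.tolist())     # [0.0, 1.0, 1.0, 0.0, 1.0, nan]

# NaN == 1 is False, so the last row is never selected by the mask
mask = nxt == 1
print(mask.tolist())    # [False, True, True, False, True, False]

# Set every row whose successor is 1 to 1
status.loc[mask] = 1
print(status.tolist())  # [0, 1, 1, 1, 1, 1]
```

Since the comparison with NaN is simply False, this approach needs no special handling for the last row.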
Upvotes: 4