ShanZhengYang

Reputation: 17631

Given a binary column in a pandas dataframe, how do I change the 0 preceding each 1 to 1?

I have the following pandas dataframe:

import pandas as pd

data = {"first_name": ["Alexander", "Alan", "Heather", "Marion", "Amy", "John"],
        "last_name": ["Miller", "Jacobson", ".", "Milner", "Cooze", "Smith"],
        "age": [42, 52, 36, 24, 73, 19],
        "marriage_status": [0, 0, 1, 1, 0, 1]}

df = pd.DataFrame(data)
df

  age first_name last_name  marriage_status
0   42  Alexander    Miller                0
1   52       Alan  Jacobson                0
2   36    Heather         .                1
3   24     Marion    Milner                1
4   73        Amy     Cooze                0
5   19       John     Smith                1
....

The column marriage_status is a column of binary data, 0 and 1. Before each 1, I would like to make the preceding row a 1 as well. In this example, the dataframe would become:

  age first_name last_name  marriage_status
0   42  Alexander    Miller                0
1   52       Alan  Jacobson                1   # this changed to 1
2   36    Heather         .                1
3   24     Marion    Milner                1
4   73        Amy     Cooze                1   # this changed to 1
5   19       John     Smith                1
....

In other words, there are "groups" of consecutive ones in this column, and I would like to set the element in the row just before each group to 1 instead of 0. How can I do this?

My first thought was to write a for loop, but that isn't a pandas-based solution. One could also try enumerate(), but then I would need to set the preceding value to 1, and I'm not sure how to do that.
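To make the intended behaviour concrete, here is a minimal plain-Python sketch of the loop idea, applied to a list rather than a dataframe column. The function name fill_preceding is hypothetical and only used for illustration; it reads the original values so that a newly written 1 does not cascade further back.

```python
def fill_preceding(values):
    """Return a copy of values where each element directly before a 1 becomes 1."""
    result = list(values)
    for i in range(len(values) - 1):
        # Check the ORIGINAL next value, so a changed 0 does not trigger
        # another change in the row before it.
        if values[i + 1] == 1:
            result[i] = 1
    return result


print(fill_preceding([0, 0, 1, 1, 0, 1]))  # [0, 1, 1, 1, 1, 1]
```

This is the result I am after, just without the explicit loop.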

Upvotes: 2

Views: 313

Answers (2)

piRSquared

Reputation: 294338

We can use the or operator |. It treats the 1s as True and the 0s as False. | evaluates to False only when a row holds a 0 and the next row also holds a 0.

df.marriage_status = (
    df.marriage_status | df.marriage_status.shift(-1)
).astype(int)

df

   age first_name last_name  marriage_status
0   42  Alexander    Miller                0
1   52       Alan  Jacobson                1
2   36    Heather         .                1
3   24     Marion    Milner                1
4   73        Amy     Cooze                1
5   19       John     Smith                1
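One thing to watch: shift(-1) leaves a NaN in the last row, which turns the shifted Series into floats. A hedged variant, assuming a pandas version where shift accepts fill_value (0.24 or later), pads with 0 instead so both operands stay integers. The names status and next_status are only illustrative.

```python
import pandas as pd

status = pd.Series([0, 0, 1, 1, 0, 1], name="marriage_status")

# shift(-1) moves every value up one row; fill_value=0 pads the last row
# with 0 instead of NaN, so the shifted Series keeps an integer dtype.
next_status = status.shift(-1, fill_value=0)

# Element-wise OR: a row becomes 1 if it, or the row after it, is 1.
result = status | next_status

print(result.tolist())  # [0, 1, 1, 1, 1, 1]
```

Because both sides are integer Series, the result is already integer-typed and no astype(int) is needed in this variant.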

Upvotes: 4

MaxU - stand with Ukraine

Reputation: 210852

You can use the Series.shift(-1) method to build a boolean mask of the rows whose next row is 1, then set those rows to 1:

In [21]: df.loc[df.marriage_status.shift(-1) == 1, 'marriage_status'] = 1

In [22]: df
Out[22]:
   age first_name last_name  marriage_status
0   42  Alexander    Miller                0
1   52       Alan  Jacobson                1
2   36    Heather         .                1
3   24     Marion    Milner                1
4   73        Amy     Cooze                1
5   19       John     Smith                1
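For anyone who wants to see the intermediate step, here is a small self-contained sketch of the same approach with the mask pulled out into its own variable (mask is just an illustrative name). The NaN that shift(-1) puts in the last row compares as False against 1, so the last row is left alone.

```python
import pandas as pd

df = pd.DataFrame({"marriage_status": [0, 0, 1, 1, 0, 1]})

# True where the NEXT row is 1; the trailing NaN compares as False.
mask = df["marriage_status"].shift(-1) == 1
print(mask.tolist())  # [False, True, True, False, True, False]

# Set only the masked rows to 1, leaving all other rows untouched.
df.loc[mask, "marriage_status"] = 1
print(df["marriage_status"].tolist())  # [0, 1, 1, 1, 1, 1]
```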

Upvotes: 4
