Rafi

Reputation: 575

How to shift a cell depending on a value in another column in python?

I have a dataframe as follows:

id | time
1  | 10:21
1  | 10:22
1  | 10:23
2  | 10:40
2  | 10:45
2  | 10:50

I would like to add a new column as follows:

id | time  | new_time
1  | 10:21 | 10:22
1  | 10:22 | 10:23
1  | 10:23 | None
2  | 10:40 | 10:45
2  | 10:45 | 10:50
2  | 10:50 | None

That is, I would like to build the new column by comparing id values in consecutive rows. If a row and the row after it have the same id, new_time for the first row should be the time value of the second row. If the ids differ (or there is no next row), new_time should be None. How can I achieve this using python or pandas?

Upvotes: 5

Views: 1944

Answers (2)

Pluckerpluck

Reputation: 731

Assuming ID is currently the index, I would try:

df = df.reset_index()
shifted = df.shift(-1)
df['new_time'] = shifted.time
df.loc[df.id != shifted.id, "new_time"] = None
df = df.set_index("id")

If it's not an index you can just skip the reset_index and set_index lines.

It basically shifts the entire dataframe up by one row, lines it up against the original, and sets new_time to None wherever the id of the shifted row no longer matches the id of the original row.
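For anyone who wants to try this on the question's data, here is a minimal self-contained sketch of the same shift-and-mask idea. It assumes id is a regular column (so the reset_index and set_index lines are skipped), and the sample frame is simply rebuilt from the table in the question:

```python
import pandas as pd

# Sample data rebuilt from the question; id is a plain column here (assumption).
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": ["10:21", "10:22", "10:23", "10:40", "10:45", "10:50"],
})

# Row i of `shifted` holds the values of row i + 1 of df (the last row is all missing).
shifted = df.shift(-1)

# Take the next row's time for every row...
df["new_time"] = shifted["time"]

# ...then blank it out wherever the next row belongs to a different id.
# The last row compares against a missing id, so it is masked as well.
df.loc[df["id"] != shifted["id"], "new_time"] = None

print(df)
```

Depending on the pandas version, the blanked cells may display as None or NaN; either way, df["new_time"].isna() picks them out.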

Upvotes: 2

vercelli

Reputation: 4757

Use .shift() to get next record:

df['new_time'] = df.shift(-1).time

Results:

    id  time    new_time
0   1   10:21   10:22
1   1   10:22   10:23
2   1   10:23   10:40
3   2   10:40   10:45
4   2   10:45   10:50
5   2   10:50   NaN

Then assign np.nan to the last row for each id (the np.NaN alias was removed in NumPy 2.0, so np.nan is the safer spelling):

import numpy as np
df.loc[df.groupby('id', as_index=False).nth([-1]).index, 'new_time'] = np.nan

Results:

    id  time    new_time
0   1   10:21   10:22
1   1   10:22   10:23
2   1   10:23   NaN
3   2   10:40   10:45
4   2   10:45   10:50
5   2   10:50   NaN
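As an alternative to the two steps above (this is a different technique, not the one used in this answer), the shift can be done per group, so values never cross an id boundary in the first place. A minimal sketch, assuming id is a regular column and using the sample data from the question:

```python
import pandas as pd

# Sample data rebuilt from the question; id is a plain column here (assumption).
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": ["10:21", "10:22", "10:23", "10:40", "10:45", "10:50"],
})

# Shift time up by one row within each id group.
# The last row of every group has no successor, so it gets a missing value.
df["new_time"] = df.groupby("id")["time"].shift(-1)

print(df)
```

This should give the same new_time column as the final result above, without the separate masking step.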

Upvotes: 6
