Tingiskhan
Tingiskhan

Reputation: 1345

Change first value of group based on condition, pandas

Assume that I have the following dataset

table = [[datetime.datetime(2015, 3, 31), 1, 0.5, 1],
         [datetime.datetime(2015, 6, 30), 1, 0.5, 0.5],
         [datetime.datetime(2015, 9, 30), 1, 0.5, 0.5],
         [datetime.datetime(2015, 12, 31), 1, 2, 0.5],
         [datetime.datetime(2015, 3, 31), 2, 0.5, 1.5],
         [datetime.datetime(2015, 6, 30), 2, 0.5, 0.5],
         [datetime.datetime(2015, 9, 30), 2, 0.5, 0.5],
         [datetime.datetime(2015, 12, 31), 2, 2, 0.5]]

df = pd.DataFrame(table, columns=['Date', 'Id', 'Value', 'Old'])

Is there any way to change the first element of Value to the corresponding element of Old if the element of Value is smaller than the one in Old? It needs to be done by each group (based on Id). My new table would thus look like

        Date  Id  Value  Old
0 2015-03-31   1    1.0  1.0
1 2015-06-30   1    0.5  0.5
2 2015-09-30   1    0.5  0.5
3 2015-12-31   1    2.0  0.5
4 2015-03-31   2    1.5  1.5
5 2015-06-30   2    0.5  0.5
6 2015-09-30   2    0.5  0.5
7 2015-12-31   2    2.0  0.5

Thanks, tingis

Upvotes: 0

Views: 2148

Answers (1)

Jianxun Li
Jianxun Li

Reputation: 24742

Since you only want to change the first element of each group, you can do a customized groupby apply function to do this.

import pandas as pd
import datetime

# your data
# =================================================
table = [[datetime.datetime(2015, 3, 31), 1, 0.5, 1],
         [datetime.datetime(2015, 6, 30), 1, 0.5, 0.5],
         [datetime.datetime(2015, 9, 30), 1, 0.5, 0.5],
         [datetime.datetime(2015, 12, 31), 1, 2, 0.5],
         [datetime.datetime(2015, 3, 31), 2, 0.5, 1.5],
         [datetime.datetime(2015, 6, 30), 2, 0.5, 0.5],
         [datetime.datetime(2015, 9, 30), 2, 0.5, 0.5],
         [datetime.datetime(2015, 12, 31), 2, 2, 0.5]]

df = pd.DataFrame(table, columns=['Date', 'Id', 'Value', 'Old'])

print(df)


        Date  Id  Value  Old
0 2015-03-31   1    0.5  1.0
1 2015-06-30   1    0.5  0.5
2 2015-09-30   1    0.5  0.5
3 2015-12-31   1    2.0  0.5
4 2015-03-31   2    0.5  1.5
5 2015-06-30   2    0.5  0.5
6 2015-09-30   2    0.5  0.5
7 2015-12-31   2    2.0  0.5

# processing
# ====================================
def func(group):
    if group.Value.values[0] < group.Old.values[0]:
        group.Value.values[0] = group.Old.values[0]
    return group

df.groupby('Id').apply(func)

        Date  Id  Value  Old
0 2015-03-31   1    1.0  1.0
1 2015-06-30   1    0.5  0.5
2 2015-09-30   1    0.5  0.5
3 2015-12-31   1    2.0  0.5
4 2015-03-31   2    1.5  1.5
5 2015-06-30   2    0.5  0.5
6 2015-09-30   2    0.5  0.5
7 2015-12-31   2    2.0  0.5

Upvotes: 2

Related Questions