Khrystyna Kosenko

Reputation: 121

How to shift a value in a DataFrame using pandas?

I have data like this, but without the z1 column. I need to add a z1 column to the DataFrame with the values shown in the example: for each row, z1 should hold the z value from one day earlier for the same Start.

[image of the example data]

I was thinking it could be done with apply and a lambda in pandas, but I'm not sure how to define the lambda function:

data = pd.read_csv("....")

data["Z"] = data[[
                "Start", "Z"]].apply(lambda x:

Upvotes: 3

Views: 1123

Answers (1)

jezrael

Reputation: 863781

You can use DataFrameGroupBy.shift with merge:

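# Convert to datetime if the column is not already datetime
df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)
# Within each 'start' group, move the date index one day forward
df1 = df.groupby('start')['z'].shift(freq='1D', periods=1).reset_index()
# Left-merge back on both keys; the shifted column gets the suffix '1' (z1)
print(pd.merge(df.reset_index(), df1, on=['start', 'date'], how='left', suffixes=('', '1')))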

        date  start       z        z1
0 2012-12-01    324  564545       NaN
1 2012-12-01    384    5555       NaN
2 2012-12-01    349     554       NaN
3 2012-12-02    855     635       NaN
4 2012-12-02    324      56  564545.0
5 2012-12-01    341      98       NaN
6 2012-12-03    324     888      56.0
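
If shift with freq behaves differently in your pandas version, the same result can probably be reached with a plain self-merge. This is only a sketch: the sample frame is rebuilt from the output above, and the names prev and result are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the output above (an assumption about the input).
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2012-12-01", "2012-12-01", "2012-12-01", "2012-12-02",
        "2012-12-02", "2012-12-01", "2012-12-03",
    ]),
    "start": [324, 384, 349, 855, 324, 341, 324],
    "z": [564545, 5555, 554, 635, 56, 98, 888],
})

# Copy the keys and z, then move every date one day forward, so a row
# dated D in `prev` carries the z that was recorded on D - 1.
prev = df[["date", "start", "z"]].copy()
prev["date"] = prev["date"] + pd.Timedelta(days=1)
prev = prev.rename(columns={"z": "z1"})

# A left merge keeps every original row; rows without a previous-day
# match for the same start get NaN in z1.
result = df.merge(prev, on=["start", "date"], how="left")
print(result)
```

Only the rows for start 324 on 2012-12-02 and 2012-12-03 find a previous-day match here, so every other z1 stays NaN.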

EDIT:

Try finding the rows with duplicated start values and filling their missing z1 with 0:

df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)
df1 = df.groupby('start')['z'].shift(freq='1D', periods=1).reset_index()
df2 = pd.merge(df.reset_index(), df1, on=['start', 'date'], how='left', suffixes=('', '1'))
# Rows whose 'start' value occurs more than once
mask = df2['start'].duplicated(keep=False)
# Fill missing z1 with 0 only for those rows
df2.loc[mask, 'z1'] = df2.loc[mask, 'z1'].fillna(0)
print(df2)
        date  start       z        z1
0 2012-12-01    324  564545       0.0
1 2012-12-01    384    5555       NaN
2 2012-12-01    349     554       NaN
3 2012-12-02    855     635       NaN
4 2012-12-02    324      56  564545.0
5 2012-12-01    341      98       NaN
6 2012-12-03    324     888      56.0
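
To see the masking step on its own, here is a small sketch that starts from the merged frame (rebuilt by hand from the first output, which is an assumption about your data) and applies only the duplicated and fillna part with .loc:

```python
import numpy as np
import pandas as pd

# Merged frame rebuilt by hand from the first output above.
df2 = pd.DataFrame({
    "date": pd.to_datetime([
        "2012-12-01", "2012-12-01", "2012-12-01", "2012-12-02",
        "2012-12-02", "2012-12-01", "2012-12-03",
    ]),
    "start": [324, 384, 349, 855, 324, 341, 324],
    "z": [564545, 5555, 554, 635, 56, 98, 888],
    "z1": [np.nan, np.nan, np.nan, np.nan, 564545.0, np.nan, 56.0],
})

# keep=False marks every occurrence of a repeated start, not just the later ones.
mask = df2["start"].duplicated(keep=False)

# Fill NaN with 0 only where start is repeated; other rows keep their NaN.
df2.loc[mask, "z1"] = df2.loc[mask, "z1"].fillna(0)
print(df2)
```

Only start 324 repeats in this sample, so just its first row changes from NaN to 0.0.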

Upvotes: 3
