DGMS89
DGMS89

Reputation: 1677

Offsetting the index in pandas dataframe (keep content intact)

Scenario: I have 2 dataframes, one has data for period X and the other has the data for period X-1 (could be day or month).

Data example:

DF1 (time X):

                Item 1       Item 2
2018-01-02       0.431        0.656
2018-01-03       0.439        0.668
2018-01-04       0.447        0.680
2018-01-05       0.455        0.692
2018-01-08       0.479        0.729
2018-01-09       0.487        0.741
2018-01-10       0.495        0.753

DF1 (time X-1):

                Item 1       Item 2
2018-01-02       0.429        0.651
2018-01-03       0.431        0.656
2018-01-04       0.439        0.668
2018-01-05       0.447        0.680
2018-01-08       0.455        0.692
2018-01-09       0.479        0.729
2018-01-10       0.487        0.741

I want my resuling dataframe to be:

X Final:
                Item 1       Item 2
2018-01-01       0.429        0.651
2018-01-02       0.431        0.656
2018-01-03       0.439        0.668
2018-01-04       0.447        0.680
2018-01-05       0.455        0.692
2018-01-08       0.479        0.729
2018-01-09       0.487        0.741
2018-01-10       0.495        0.753

What I already tried: I was initially trying to merge the last column of X array with x-1 array, and offset the index, but in doing that, I also offset the data, which is not what I want:

df_xfinal = df_x1.iloc[:]
df_xlastrow = df_x.iloc[-1:]
df_xfinal = df_xfinal.append(df_xlastrow)
df_xfinal = df_xfinal.shift(periods=-1, axis=0)

Question: Can this be done in a straightforward way, without having through loop across all the array items?

Upvotes: 2

Views: 694

Answers (1)

sacuL
sacuL

Reputation: 51405

If I understand correctly:

# Make sure the indices are datetimes (if they aren't already)
DF1.index = pd.to_datetime(DF1.index)
DF2.index = pd.to_datetime(DF2.index)

# subtract a day from the index of df2
DF2.index = DF2.index - pd.Timedelta('1d')

# concatenate the two dataframes, sort by index, and drop duplicates
final_df = pd.concat([DF1, DF2]).sort_index().drop_duplicates()

Which returns:

>>> final_df
            Item 1  Item 2
2018-01-01   0.429   0.651
2018-01-02   0.431   0.656
2018-01-03   0.439   0.668
2018-01-04   0.447   0.680
2018-01-05   0.455   0.692
2018-01-08   0.479   0.729
2018-01-09   0.487   0.741
2018-01-10   0.495   0.753

Upvotes: 2

Related Questions