Viktor.w

Reputation: 2317

Resample error: ValueError: cannot reindex a non-unique index with a method or limit

My dataframe looks like this:

    timestamp
2020-03-01 01:11:42.520      -674.0
2020-03-01 02:00:48.778      -700.0
2020-03-01 02:00:58.850      -702.0
2020-03-01 11:45:23.741     -1249.0
2020-03-02 01:56:51.021     -1229.0
2020-03-02 01:56:51.021      -917.0
2020-03-02 01:56:51.021      -837.0

What I am trying to do is the following:

cum = (orders[['cum']]
        .resample('1S')
        .bfill()
        .fillna('ffill')
      )

But then I get the error message in the title. Any idea what it means? Thanks for the help!

Upvotes: 1

Views: 3103

Answers (1)

jezrael

Reputation: 863511

One idea is to first filter out the duplicated index values before resampling, as in your solution. Then select the duplicated rows into a separate Series (added), floor its index to whole seconds, add it back to the resampled result, and sort by index:

print (orders)
                            cum
timestamp                      
2020-03-01 01:11:42.520  -674.0
2020-03-01 02:00:48.778  -700.0
2020-03-01 02:00:58.850  -702.0
2020-03-01 11:45:23.741 -1249.0
2020-03-02 01:56:51.021 -1229.0
2020-03-02 01:56:51.021  -917.0
2020-03-02 01:56:51.021  -837.0
2020-03-02 01:56:54.021   -67.0

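import pandas as pd

# True for every repeat of a timestamp after its first occurrence
mask = orders.index.duplicated()
# resample only the rows whose timestamps are not repeats
cum = (orders.loc[~mask, 'cum']
        .resample('1s')
        .bfill()
        .ffill()
      )
# the repeated rows: floor their timestamps to whole seconds
added = orders.loc[mask, 'cum']
added.index = added.index.floor('s')
# Series.append was removed in pandas 2.0, so use pd.concat instead
cum = pd.concat([added, cum]).sort_index()
print (cum.tail(10))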
timestamp
2020-03-02 01:56:47   -1229.0
2020-03-02 01:56:48   -1229.0
2020-03-02 01:56:49   -1229.0
2020-03-02 01:56:50   -1229.0
2020-03-02 01:56:51   -1229.0
2020-03-02 01:56:51    -917.0
2020-03-02 01:56:51    -837.0
2020-03-02 01:56:52     -67.0
2020-03-02 01:56:53     -67.0
2020-03-02 01:56:54     -67.0
Name: cum, dtype: float64
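
As for what the message means: resample cannot back-fill onto a new grid when the original index contains repeated timestamps, and here 2020-03-02 01:56:51.021 appears three times. Below is a minimal sketch for checking this yourself. The frame is a reconstruction of the sample data printed above, and the names has_duplicates, n_repeats and unique_cum are only illustrative.

```python
import pandas as pd

# Reconstruction of the sample data shown above (assumed, not the real file)
idx = pd.to_datetime([
    "2020-03-01 01:11:42.520",
    "2020-03-01 02:00:48.778",
    "2020-03-01 02:00:58.850",
    "2020-03-01 11:45:23.741",
    "2020-03-02 01:56:51.021",
    "2020-03-02 01:56:51.021",
    "2020-03-02 01:56:51.021",
    "2020-03-02 01:56:54.021",
])
orders = pd.DataFrame(
    {"cum": [-674.0, -700.0, -702.0, -1249.0, -1229.0, -917.0, -837.0, -67.0]},
    index=pd.DatetimeIndex(idx, name="timestamp"),
)

# The index is not unique because one timestamp occurs three times
has_duplicates = not orders.index.is_unique

# duplicated() flags every repeat after the first occurrence
mask = orders.index.duplicated()
n_repeats = int(mask.sum())

# With the repeats left out, upsampling to one-second steps works
unique_cum = orders.loc[~mask, "cum"].resample("1s").bfill().ffill()
```

If orders.index.is_unique is False for your real data, the mask above shows which rows need to be set aside (or aggregated) before resampling.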

Upvotes: 1
