john
john

Reputation: 27

the ValueError: cannot reindex a non-unique index with a method or limit appears for just cryptocurrency

Would you please tell me what is wrong with the following as I get the error:

ValueError: cannot reindex a non-unique index with a method or limit

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import pandas_datareader as web



data= web.get_data_yahoo("BTC-USD",
                        start = "2015-01-01 ",
                        end = "2021-01-01   ")



btc_dailly_return= data['Adj Close'].pct_change()
btc_monthly_returns = data['Adj Close'].resample('M').ffill().pct_change()

Upvotes: 0

Views: 1741

Answers (1)

pvandyken
pvandyken

Reputation: 86

When you use resample, you have to tell it how you would like to combine all the entries within the timeframe you chose. In your example, you're combining all the values within one month, you could combine them by adding them together, by taking the average, the standard devation, the maximum value, etc. So you have to tell Pandas what you would like to do by providing an additional method:

data['col'].resample('M').sum()
data['col'].resample('M').max()
data['col'].resample('M').mean()

In your case, last() is probably the most reasonable, so just change your last line to:

btc_monthly_returns = data['Adj Close'].resample('M').last().ffill().pct_change()

As to why the error only pops up with BTC-USD: that particular table has a duplicate date entry, causing ffill() to throw an error. last() (or any other reduction type aggregator) doesn't care about the duplicate.

Generally, resample('<method>').ffill() should be used for upsampling data, i.e. turning a list of months into a list of days. In that case ffill() would fill all the newly generated timestamps with the value from the previous valid timestamp. Your example downsamples, so a reducing aggregator like last, sum, or mean should be called.

Upvotes: 2

Related Questions