Reputation: 27
Could you please tell me what is wrong with the following code? I get this error:
ValueError: cannot reindex a non-unique index with a method or limit
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import pandas_datareader as web
data = web.get_data_yahoo("BTC-USD",
                          start="2015-01-01",
                          end="2021-01-01")
btc_dailly_return= data['Adj Close'].pct_change()
btc_monthly_returns = data['Adj Close'].resample('M').ffill().pct_change()
Upvotes: 0
Views: 1741
Reputation: 86
When you use resample, you have to tell it how to combine all the entries that fall within the timeframe you chose. In your example, you're combining all the values within one month. You could combine them by adding them together, or by taking the average, the standard deviation, the maximum value, etc. So you have to tell Pandas what you would like to do by calling an additional method:
data['col'].resample('M').sum()
data['col'].resample('M').max()
data['col'].resample('M').mean()
In your case, last() is probably the most reasonable choice, so just change your last line to:
btc_monthly_returns = data['Adj Close'].resample('M').last().ffill().pct_change()
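To see what this does without hitting the Yahoo API, here is a minimal sketch on synthetic data. The price values are made up and only stand in for data['Adj Close']. It passes pd.offsets.MonthEnd() as the resampling rule rather than the 'M' string, on the assumption that your pandas version may prefer the newer 'ME' alias; the offset object should behave the same way in either case.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices standing in for data['Adj Close'] (hypothetical values):
# 100.0 on 2020-01-01, rising by exactly 1.0 per day.
idx = pd.date_range("2020-01-01", "2020-03-31", freq="D")
prices = pd.Series(np.linspace(100.0, 190.0, len(idx)), index=idx)

# Downsample to one value per month by keeping the last observation of each month.
# pd.offsets.MonthEnd() plays the role of the 'M' rule used in the answer.
month_end = prices.resample(pd.offsets.MonthEnd()).last()

# Month-over-month return; the first month has no predecessor, so it is NaN.
monthly_returns = month_end.pct_change()

print(month_end)
print(monthly_returns)
```

Swapping last() for mean() or max() changes which monthly value the returns are computed from, so the choice of aggregator matters for the result.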
As to why the error only pops up with BTC-USD: that particular table has a duplicate date entry, causing ffill() to throw an error. last() (or any other reduction-type aggregator) doesn't care about the duplicate.
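If you want to check for the duplicate yourself, something along these lines should work. This is a sketch on a small made-up series, not the real BTC-USD table, and keeping the last of the duplicated rows is just one possible choice.

```python
import pandas as pd

# Small hypothetical series with 2020-01-02 appearing twice in the index.
idx = pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03"])
prices = pd.Series([10.0, 11.0, 11.5, 12.0], index=idx)

# True when at least one timestamp occurs more than once.
has_duplicates = not prices.index.is_unique

# Show every row whose timestamp is duplicated (keep=False marks all copies).
duplicate_rows = prices[prices.index.duplicated(keep=False)]

# Drop the earlier copies and keep the last row for each timestamp.
deduped = prices[~prices.index.duplicated(keep="last")]

print(has_duplicates)
print(duplicate_rows)
print(deduped)
```

With a unique index, the original resample(...).ffill() call no longer has a non-unique index to reindex, although last() remains the more natural choice for downsampling.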
Generally, resample('<freq>').ffill() should be used for upsampling data, i.e. turning a list of months into a list of days. In that case ffill() fills all the newly generated timestamps with the value from the previous valid timestamp. Your example downsamples, so a reducing aggregator like last, sum, or mean should be called.
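For contrast, here is a small sketch of the upsampling case, where ffill() is the right tool. The two month-end values are invented for illustration.

```python
import pandas as pd

# Two hypothetical month-end observations.
monthly = pd.Series(
    [1.0, 2.0],
    index=pd.to_datetime(["2020-01-31", "2020-02-29"]),
)

# Upsample to daily frequency: new timestamps are created between the two
# observations, and ffill() copies the previous known value into each of them.
daily = monthly.resample("D").ffill()

print(daily.head())
print(daily.tail())
```

Every day from 2020-01-31 through 2020-02-28 carries the January value, and only 2020-02-29 picks up the February one.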
Upvotes: 2