Reputation: 3
I am fairly new to Python and am working on a stock analysis script. Eventually it will take a stock symbol and calculate the Sharpe ratio, Treynor ratio, and other financial metrics. Right now I am having trouble getting pandas to work properly: I am unable to access just one column of the DataFrame to calculate the yield for a stock.
from pandas.io.data import DataReader
from datetime import date, timedelta
def calc_yield(now, old):
    return (now-old)/old

def yield_array(cl):
    array = []
    count = 0
    for i in cl:
        old = cl[count]
        count += 1
        new = cl[count]
        array.append(calc_yield(new, old))
    return array
market = '^GSPC'
ticker = "AAPL"
days = 10
# set start and end dates
edate = date.today() - timedelta(days=1)
sdate = edate - timedelta(days=days)
# Read the stock price data from Yahoo
data = DataReader(ticker, 'yahoo', start=sdate, end=edate)
close = data['Adj Close']
print yield_array(close)
Error:
/Users/Tim/anaconda/bin/python "/Users/Tim/PycharmProjects/Test2/module tests.py"
Traceback (most recent call last):
  File "/Users/Tim/PycharmProjects/Test2/module tests.py", line 35, in <module>
    print yield_array(close)
  File "/Users/Tim/PycharmProjects/Test2/module tests.py", line 16, in yield_array
    new = cl[count]
  File "/Users/Tim/anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 484, in __getitem__
    result = self.index.get_value(self, key)
  File "/Users/Tim/anaconda/lib/python2.7/site-packages/pandas/tseries/index.py", line 1243, in get_value
    return _maybe_box(self, Index.get_value(self, series, key), series, key)
  File "/Users/Tim/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 1202, in get_value
    return tslib.get_value_box(s, key)
  File "tslib.pyx", line 540, in pandas.tslib.get_value_box (pandas/tslib.c:11833)
  File "tslib.pyx", line 555, in pandas.tslib.get_value_box (pandas/tslib.c:11680)
IndexError: index out of bounds
Process finished with exit code 1
Upvotes: 0
Views: 1682
Reputation: 10298
I think I see your problem. Given this function:
def yield_array(cl):
    array = []
    count = 0
    for i in cl:
        old = cl[count]
        count += 1
        print count
        new = cl[count]
        array.append(calc_yield(new, old))
        print old
        print new
    return array
The problem is that on the last item of cl, you add 1 to count, which gives an index one greater than the maximum index of cl. That raises the IndexError, because the code tries to access an element that doesn't exist. You would need to do something like for i in cl[:-1], which would skip the last element.
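As a minimal sketch of that fix, the version below pairs each price with the one after it instead of keeping a manual counter, so it can never read past the end. The sample prices are made up for illustration, and zipping two slices is just one of several ways to walk over neighbouring values.

```python
def calc_yield(now, old):
    """Return the fractional change from old to now."""
    return (now - old) / old


def yield_array(cl):
    """Return period-over-period yields for a sequence of prices."""
    values = list(cl)  # accepts a plain list or a pandas Series of prices
    # Pair each price with its successor; the last price is only ever
    # used as "new", never as "old", so there is no out-of-bounds access.
    return [calc_yield(new, old) for old, new in zip(values[:-1], values[1:])]


# Hypothetical closing prices, not real market data.
prices = [100.0, 110.0, 99.0]
yields = yield_array(prices)
```

For n prices this returns n - 1 yields, one for each consecutive pair.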
However, there is a much simpler way to do this by vectorizing. You can reduce this whole function to:
close = data['Adj Close']
yield_data = close.diff()/close.shift(1)
or better yet, you can put the result back in the DataFrame for later use:
close = data['Adj Close']
data['Yield'] = close.diff()/close.shift(1)
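To see what the vectorized expression produces without downloading anything, here is a small sketch on a hand-made Series. The prices and dates are invented for illustration. It also compares the result with pct_change(), which is pandas' built-in for the same calculation, in case you prefer that spelling.

```python
import pandas as pd

# Hypothetical adjusted closing prices indexed by date (not real data).
close = pd.Series(
    [100.0, 110.0, 99.0],
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"]),
)

# Same expression as above: change since the previous row,
# divided by the previous row's value.
manual = close.diff() / close.shift(1)

# Built-in equivalent for data without gaps.
builtin = close.pct_change()
```

The first entry is NaN in both results because there is no earlier price to compare against, so you may want to drop it with dropna() before computing ratios from the yields.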
Upvotes: 1