Reputation: 26037
I'm trying to create a program that tracks stocks I own and adjusts the percent of each stock as the values change.
The problem I'm having is with the design of the algorithm that tracks the percentages as the portfolio changes. I have two restrictions: I only work with percentages, and there are days when I don't buy or sell anything, but those days must still be accounted for. On those days the stocks stay the same, but the percentages change because the underlying values of the stocks change. For example, if I own 50% IBM and 50% Google, and Google's stock goes up while IBM's goes down, then Google's weighting has increased and IBM's has decreased. I started with this simple goal but I keep overcomplicating it as I get to different stages, so here's my approach, but I'm totally willing to gut any or all parts that are inefficient.
I start by creating two dictionaries, holding_stock and holding_stock_weight. The keys for each are dates and the values are lists of either the stocks or their weights. For example:
holding_stock[2012-07-17] = ["RY", "IBM"]
holding_stock_weight[2012-07-17] = [50.0, 50.0]
holding_stock[2012-07-18] = ["GS", "A", "IBM"]
holding_stock_weight[2012-07-18] = [20.0,30.0,50.0]
holding_stock[2012-07-23] = ["GS", "A", "IBM", "BSE"]
holding_stock_weight[2012-07-23] = [10.0,35,40,3]
All the stock information is returned as a list of objects (sorted by date). My idea was to start from the earliest date (in this example, 2012-07-17) and work my way up to the current date while adjusting the percentages (I thought this was the simplest approach). I can isolate each day, and unless the date exists as a key in my holding_stock dict, I continue to use the same values. Finally, at the end of each day, I want to take the return value for each stock and divide it by the daily total to update the percentage.
But there's a problem with my design: for many days I get None as the daily return, and my numbers look very odd.
Here's what I have done so far (I tried to comment it as best I could to explain my logic, and I'm sorry in advance for how ugly it looks. I think my simple approach has become overly complex and is probably far from the best solution now):
current_holding_date = fund_date[0] #this is used to track which version of fund is current
current_date = fund_date[0] #this variable used to track date changes
current_holdings = holding_stock[current_holding_date] #list of current stocks being held
current_weight = holding_stock_weight[current_holding_date] #list of current weights of stock
current_return = [None]*len(current_holdings) #create a blank lists of the number of items we need so we can put in returns here
total = 0 #init variable to keep overall total
today_total = 0 #init variable to keep track of daily total
for x in stock_returns: #stock_returns is a list of objects that contains 3 items per object, x.ticker=stockname, x.date=date, x.close=price
    if x.date != current_date: #this checks if the day has changed.
        #this is where i planned to revalue the new percentages
        print 'new date', x.date
        #print 'todays total is ', today_total
        #get new percentages
        for item in range(len(current_holdings)):
            #TODO: this skips the first and last item
            pass
            print current_holdings[item], ' - ', current_weight[item], current_return[item]
        today_total = 0
        current_date = x.date
        if x.date in fund_date: #fund_date is just a list of all the days we can pick from
            current_holding_date = x.date #change current date to the new day if it exists in the database
            current_holdings = holding_stock[current_holding_date] #replace new holdings
            current_weight = holding_stock_weight[current_holding_date] #replace new weights
            current_return = [None]*len(current_holdings) #recreate area to hold results
            #print 'we just hit a date'
    if x.ticker in current_holdings: #go through each object(its already sorted by date) and if the stock exists in the current list then calculate the return
        location = current_holdings.index(x.ticker)
        today_total = today_total + ((current_weight[location] * 0.01) * x.close)
        current_return[location] = ((current_weight[location] * 0.01) * x.close)
        #print x.date, ': ', x.ticker, current_weight[location] * 0.01 ,' * ', x.close, ' = ', current_return[location], ' total = ', total
    #print current_holding_date
    #print x.ticker, x.date, x.close
Here's the result when I run the above code:
2012-07-17 [u'RY', u'IBM'] [50.0, 50.0]
2012-07-18 [u'GS', u'A', u'IBM'] [20.0, 30.0, 50.0]
2012-07-23 [u'GS', u'A', u'IBM', u'BSE'] [10.0, 35.0, 40.0, 3.0]
new date 2012-07-18
RY - 50.0 25.865
IBM - 50.0 None
new date 2012-07-19
GS - 20.0 None
A - 30.0 None
IBM - 50.0 None
new date 2012-07-20
GS - 20.0 19.0
A - 30.0 11.538
IBM - 50.0 97.67
new date 2012-07-23
GS - 20.0 18.832
A - 30.0 11.268
IBM - 50.0 96.225
new date 2012-07-24
GS - 10.0 None
A - 35.0 None
IBM - 40.0 None
BSE - 3.0 None
I suspect it has to do with the way I'm assigning and changing the current_holdings, current_weight and current_return lists but I can't see my error. For some reason I only get data on 2012-07-20 and not on the other days.
Upvotes: 0
Views: 130
Reputation: 3716
Check stock_returns to see whether it contains all the tickers you need. Also, you're doing current_return = [None]*len(current_holdings) first, and only afterwards you're doing:
print current_holdings[item], ' - ', current_weight[item], current_return[item]
So even if your code computed current_return correctly, you're resetting this variable before printing it.
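One way to avoid the ordering problem is to build a fresh result container for each day and report it before anything is cleared. The following is only a minimal sketch, written for Python 3 (the question's code is Python 2). The Quote type, the daily_values function, and the prices are all made up for illustration; only the ticker/date/close fields and the weight * 0.01 * close calculation come from the question.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical record type mirroring the question's x.ticker / x.date / x.close
Quote = namedtuple("Quote", ["ticker", "date", "close"])

def daily_values(quotes, weights):
    """Yield (date, {ticker: weighted value}) for each day.

    quotes must already be sorted by date; weights maps ticker -> percent.
    A ticker with no quote on a given day keeps the value None.
    """
    for date, day_quotes in groupby(quotes, key=lambda q: q.date):
        values = dict.fromkeys(weights)  # fresh container per day, all None
        for q in day_quotes:
            if q.ticker in weights:
                values[q.ticker] = weights[q.ticker] * 0.01 * q.close
        yield date, values  # the day's result is handed out before any reset

# Made-up prices, only to exercise the function
quotes = [
    Quote("RY", "2012-07-17", 50.0),
    Quote("IBM", "2012-07-17", 200.0),
    Quote("RY", "2012-07-18", 52.0),
]
for date, values in daily_values(quotes, {"RY": 50.0, "IBM": 50.0}):
    print(date, values)
```

If a ticker still shows None with this structure, the quote is genuinely missing from the input for that day, which is the first thing to check in stock_returns.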
EDIT:
You're using list.index() to find the index of a particular ticker in your current_holdings list. If current_holdings contains duplicates, that call will always return the first occurrence of x.ticker (see the Python docs on Sequence Types):
location = current_holdings.index(x.ticker)
today_total = today_total + ((current_weight[location] * 0.01) * x.close)
current_return[location] = ((current_weight[location] * 0.01) * x.close)
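To illustrate the duplicate issue, here is a small sketch in Python 3. The holdings and weights lists below are invented for the demonstration, and the dictionary keyed by ticker is just one possible alternative, not something the question's code already does.

```python
holdings = ["GS", "A", "IBM", "GS"]  # hypothetical list with a duplicate ticker
weights = [10.0, 35.0, 40.0, 15.0]   # hypothetical weights, same order

# list.index() stops at the first match, so the second "GS" is never reached
first = holdings.index("GS")

# Every position of the ticker, if duplicates are really intended
all_positions = [i for i, t in enumerate(holdings) if t == "GS"]

# Possible alternative: key the weights by ticker so no positional lookup is needed
weight_by_ticker = {}
for ticker, weight in zip(holdings, weights):
    weight_by_ticker[ticker] = weight_by_ticker.get(ticker, 0.0) + weight

print(first, all_positions, weight_by_ticker["GS"])
```

With a dictionary, a lookup such as weight_by_ticker[x.ticker] would replace the index() call and the parallel-list access.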
Upvotes: 1
Reputation: 43517
Parallel lists are often a code smell and this one makes me think that your data representation is causing you to think wrongly about the problem.
You've got data which should represent real events. A tuple is far better for keeping that data related:
('buy', 'goog', '2012-07-26', 10000, 600.00)
Named tuples help make your code even more readable (with fewer comments ;).
The second problem is that you are trying to store derived data (what am I holding?) as the primary information. This has got to be a source of confusion.
If I were to tell you that today was 5 degrees hotter than yesterday, which was 2 degrees colder than the day before, and 10 degrees hotter than it was on Monday which was 18, what's the temperature now? By modelling what happens between events I think you are buying confusion.
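As a rough sketch of that idea in Python 3: store the trades as named tuples and derive the holdings and their weights whenever they are needed. The field names (action, ticker, date, shares, price) are my guess at what the tuple above means, and the helper functions and prices are invented for illustration.

```python
from collections import namedtuple, defaultdict

# Field names are assumptions; the example tuple only shows the values
Trade = namedtuple("Trade", ["action", "ticker", "date", "shares", "price"])

def shares_held(trades, as_of):
    """Sum shares per ticker over all trades dated on or before as_of."""
    held = defaultdict(float)
    for t in trades:
        if t.date <= as_of:  # ISO date strings compare chronologically
            sign = 1 if t.action == "buy" else -1
            held[t.ticker] += sign * t.shares
    return {ticker: n for ticker, n in held.items() if n}

def weights(held, closes):
    """Percent weight of each holding, given a {ticker: close price} mapping."""
    values = {ticker: n * closes[ticker] for ticker, n in held.items()}
    total = sum(values.values())
    if total == 0:
        return {}
    return {ticker: 100.0 * v / total for ticker, v in values.items()}

# Made-up trades and prices
trades = [
    Trade("buy", "goog", "2012-07-26", 10, 600.00),
    Trade("buy", "ibm", "2012-07-26", 30, 200.00),
]
held = shares_held(trades, "2012-07-27")
print(weights(held, {"goog": 600.0, "ibm": 200.0}))  # equal value at purchase
print(weights(held, {"goog": 660.0, "ibm": 180.0}))  # goog up, ibm down
```

Days without trades need no special handling here: the holdings stay the same and only the closing prices passed to weights change, so the percentages drift on their own.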
Upvotes: 1