Reputation: 35
I'm new to Stack Overflow and also to Python. I have been trying for a few days to get this to work, but I don't quite get what I'm doing wrong. I have also searched a lot for a similar problem, but with no luck.
The following code reads some sales data from a CSV file, then returns a dictionary keyed by two values (a unique code and the year) whose values are the sum of the sales for a specific month. I pass two arguments to the function: the code and the month. That part does what I want. The problem comes when I want to iterate through different months: no matter what value (month) I pass to the function, it always returns the first value analyzed.
import csv
from collections import defaultdict

d = defaultdict(int)
spamReader = csv.reader(open(r'C:\Sales\Sales_CO05.csv', 'rU'))

# Return the sales sum for a given month and code
def sum_data_by_month(ean, m):
    for line in spamReader:
        tokens = [t for t in line]
        if tokens[5] == str(ean):
            try:
                sid = tokens[5]
                dusid = tokens[18]
                value = int(str(tokens[m]).replace(',',''))
            except ValueError:
                continue
            d[sid,dusid] += value
    return d
#Try to iterate over different month values
j = sum_data_by_month('7702010381089', 6)
f = sum_data_by_month('7702010381089', 7)
m = sum_data_by_month('7702010381089', 8)
a = sum_data_by_month('7702010381089', 9)
This is the result that I'm getting:
defaultdict(<type 'int'>, {('7702010381089', '2013'): 80, ('7702010381089', '2014'): 363})
defaultdict(<type 'int'>, {('7702010381089', '2013'): 80, ('7702010381089', '2014'): 363})
defaultdict(<type 'int'>, {('7702010381089', '2013'): 80, ('7702010381089', '2014'): 363})
defaultdict(<type 'int'>, {('7702010381089', '2013'): 80, ('7702010381089', '2014'): 363})
And this is what I’m expecting:
defaultdict(<type 'int'>, {('7702010381089', '2013'): 80, ('7702010381089', '2014'): 363})
defaultdict(<type 'int'>, {('7702010381089', '2013'): 229, ('7702010381089', '2014'): 299})
etc..
It seems as if the dict is stuck in some kind of memory state that is not allowing it to be updated. If I run a single instance of the function (i.e. j = sum_data_by_month('7702010381089', 8)), I get the desired value.
Any help will be really appreciated.
Thanks!
Upvotes: 2
Views: 207
Reputation: 114098
Dictionaries are mutable, and your function returns the same module-level dictionary d on every call.
j = sum_data_by_month('7702010381089', 6)
j is d # true
This does not create a new dictionary; j just points to the existing dictionary d.
f = sum_data_by_month('7702010381089', 7) #the dictionary has changed
f is j # true , both point to the same dictionary
f is d # true , both point to d to be specific
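To make the aliasing concrete, here is a minimal sketch. The function add_sale and its values are hypothetical stand-ins for sum_data_by_month, not the original code; the only point is that a function returning a module-level dict hands back the same object every time.

```python
from collections import defaultdict

d = defaultdict(int)  # module-level dict, shared by every call


def add_sale(key, value):
    # Hypothetical stand-in for sum_data_by_month:
    # it mutates the shared dict and returns that same dict.
    d[key] += value
    return d


j = add_sale('2013', 80)
f = add_sale('2013', 229)

print(j is f)      # True: both names refer to the same object
print(j['2013'])   # 309: the second call changed what j shows as well
```

Any result saved from an earlier call therefore always shows the current state of d, not a snapshot.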
You can avoid sharing the dictionary by returning a copy:
from copy import deepcopy
...
def sum_data_by_month(ean, m):
    ...
    return deepcopy(d)  # a new dict, no longer just a pointer to the same d
    # or, maybe even better:
    # return dict(d)
Now each call returns a separate object:
j = sum_data_by_month('7702010381089', 6)
j is d # false
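Returning a copy may not be enough on its own to get different totals per month. A csv.reader is an iterator that can only be consumed once, so a reader created at module level yields no rows on the second call, and the shared d still carries the first month's sums. A minimal sketch that builds a fresh dict and re-opens the file on every call could look like the following. It assumes Python 3 (hence open(path, newline='') instead of 'rU'), adds a hypothetical path parameter, and keeps the column positions 5 and 18 from the question.

```python
import csv
from collections import defaultdict


def sum_data_by_month(path, ean, m):
    """Sum column m for rows whose column 5 equals ean, keyed by (code, year)."""
    totals = defaultdict(int)  # fresh dict per call, so results are independent
    # Re-open the file on every call: a csv.reader can only be consumed once.
    with open(path, newline='') as f:
        for tokens in csv.reader(f):
            # Skip rows too short to hold the code (5) and year (18) columns.
            if len(tokens) <= 18 or tokens[5] != str(ean):
                continue
            try:
                value = int(tokens[m].replace(',', ''))
            except (ValueError, IndexError):
                continue  # non-numeric or missing month cell
            totals[tokens[5], tokens[18]] += value
    return dict(totals)
```

With this shape, j = sum_data_by_month(path, '7702010381089', 6) and f = sum_data_by_month(path, '7702010381089', 7) are separate dictionaries, each computed from a full pass over the file.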
Upvotes: 1