Reputation: 137
I wrote the following code in which I create a dictionary of pandas
dataframes:
import pandas as pd
import numpy as np
classification = pd.read_csv('classification.csv')
thresholdRange = np.arange(0, 70, 0.5).tolist()
classificationDict = {}
for t in thresholdRange:
    classificationDict[t] = classification
for k, v in classificationDict.iteritems():
    v['Threshold'] = k
In this case, I want to add a column called 'Threshold' to every pandas dataframe in the dictionary, filled with that dataframe's dictionary key. However, with the code above every dataframe ends up with the same value. What am I missing here? Perhaps I am complicating things for myself with this approach, but I'd greatly appreciate your help.
Upvotes: 0
Views: 851
Reputation: 2114
Sorry, I got your question wrong. Now this is the issue:
Obviously, classification (a pandas dataframe, I suppose) is a mutable object, and putting the same mutable object into a list or a dict several times does not copy it. Every entry refers to one and the same object, which is surprising behaviour for Python beginners. If you modify the object through one entry, the change shows up in all of them. Try this:
a = [1]
b = [a, a]
b[0][0] = 2
print(b[1])  # prints [2]
This is what happens to your dict.
You have to add different objects to the dict. A pandas dataframe has a .copy() method to do this. Alternatively, I found this post for you, with (in essence) the same problem; there are further solutions there:
https://stackoverflow.com/a/2612815/6053327
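Here is a minimal sketch of the .copy() fix applied to the code from the question. Some assumptions: the small in-memory dataframe and its "score" column are hypothetical stand-ins for classification.csv, the variable names are changed to snake_case, and the loop uses Python 3's .items() (the question's .iteritems() is the Python 2 spelling).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_csv('classification.csv')
classification = pd.DataFrame({"score": [10.0, 35.5, 62.0]})

threshold_range = np.arange(0, 70, 0.5).tolist()

# .copy() gives every key its own independent dataframe
# instead of another reference to the same object.
classification_dict = {t: classification.copy() for t in threshold_range}

# Each copy now receives its own key as the 'Threshold' value.
for k, v in classification_dict.items():
    v["Threshold"] = k

print(classification_dict[0.5]["Threshold"].tolist())
print(classification_dict[69.5]["Threshold"].tolist())
```

Because each entry is a separate copy, the original classification dataframe is left without a 'Threshold' column. Writing classification.assign(Threshold=t) inside the comprehension should work as well, since assign returns a new dataframe.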
Upvotes: 1
Reputation: 2114
Of course you get the same value. You are doing the same assignment over and over again in
for k, v in classificationDict.iteritems():
because your v's are all identical; you assigned them in the first for loop.
Did you try debugging it yourself and printing classification? I assume that it is only the first line?
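One way to do that debugging is to check object identity with `is`. The sketch below uses a small hypothetical dataframe and only three keys in place of the question's data; it shows that every dict value is the very same object, so each pass of the loop overwrites the column written by the previous one.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe read from classification.csv
classification = pd.DataFrame({"score": [10.0, 35.5, 62.0]})

# Same pattern as in the question: no copy is made
shared = {t: classification for t in (0.0, 0.5, 1.0)}

# True if every value is the identical object, not merely an equal one
same_object = all(v is classification for v in shared.values())

for k, v in shared.items():
    v["Threshold"] = k  # overwrites the column on the one shared dataframe

# Only the value from the last iteration survives
last_written = classification["Threshold"].unique().tolist()

print(same_object)
print(last_written)
```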
Upvotes: 0