Reputation: 844
I'm trying to run the following code to create a new column, 'Median_Rank':
N=data2.Rank.count()
for i in data2.Rank:
    data2['Median_Rank']=i-0.3/(N+0.4)
But I'm getting a constant value of 0.99802 in every row, even though my Rank column looks like this:
data2.Rank.head()
Out[464]:
4131 1.0
4173 3.0
4172 3.0
4132 3.0
5335 10.0
4171 10.0
4159 10.0
5079 10.0
4115 10.0
4179 10.0
4180 10.0
4147 10.0
4181 10.0
4175 10.0
4170 10.0
4116 24.0
4129 24.0
4156 24.0
4153 24.0
4160 24.0
5358 24.0
4152 24.0
Could somebody please point out the errors in my code?
Upvotes: 0
Views: 336
Reputation: 657
This happens because each time you execute data2['Median_Rank']=i-0.3/(N+0.4), you overwrite the entire column with the single value computed by the expression. The easiest fix doesn't need a loop at all:
N=data2.Rank.count()
data2['Median_Rank'] = data2.Rank-0.3/(N+0.4)
This works because pandas supports element-wise operations on Series.
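To illustrate the element-wise behaviour, here is a minimal sketch on a toy frame. The five rows are copied from the question's output and stand in for the real data2; the Median_Rank_parens column name is made up for illustration. It also shows a parenthesised variant, on the assumption (not stated in the question) that the formula (i - 0.3) / (N + 0.4), often called Benard's approximation, may have been intended. As written, operator precedence evaluates the division before the subtraction.

```python
import pandas as pd

# Toy stand-in for data2: index and Rank values taken from the question's first rows
data2 = pd.DataFrame(
    {"Rank": [1.0, 3.0, 3.0, 3.0, 10.0]},
    index=[4131, 4173, 4172, 4132, 5335],
)

N = data2["Rank"].count()  # number of non-null ranks (5 for this toy frame)

# The scalar 0.3 / (N + 0.4) is subtracted from every element of the Series
data2["Median_Rank"] = data2["Rank"] - 0.3 / (N + 0.4)

# Assumption: if (i - 0.3) / (N + 0.4) was the intended formula,
# the subtraction needs its own parentheses
data2["Median_Rank_parens"] = (data2["Rank"] - 0.3) / (N + 0.4)

print(data2)
```

Each row now gets a value derived from its own Rank, so the column is no longer constant.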
If you still want to use a for loop, you will need to use .at and iterate over the rows as follows:
for i, el in zip(data2.index, data2.Rank.values):
    data2.at[i,'Median_Rank']=el-0.3/(N+0.4)
Upvotes: 1
Reputation: 164663
Your code isn't vectorised. Use this:
N = data2.Rank.count()
data2['Median_Rank'] = data2['Rank'] - 0.3 / (N+0.4)
The reason your code does not work is that you are assigning the entire column on each pass through the loop. Only the value from the last iteration sticks, so the values in data2['Median_Rank'] are guaranteed to be identical.
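A small sketch of that overwrite, using a made-up three-row frame rather than the real data2, may make it easier to see:

```python
import pandas as pd

# Hypothetical three-row frame, only for demonstration
data2 = pd.DataFrame({"Rank": [3.0, 10.0, 24.0]})
N = data2["Rank"].count()

for i in data2["Rank"]:
    # A scalar on the right-hand side is assigned to every row of the column
    data2["Median_Rank"] = i - 0.3 / (N + 0.4)

# After the loop, every row holds the value computed from the last Rank (24.0)
print(data2)
```

Assuming the real data ended on a Rank of 1.0 with about 151 rows, the same mechanism would give 1 - 0.3/151.4, which is roughly the 0.99802 reported in the question.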
Upvotes: 1