Reputation: 8108
I am doing some calculations in pandas and the .loc indexer is giving unexpected results. I am not sure whether I am misusing the syntax or this is a bug.
df = pd.DataFrame(index=['series1', 'series2', 'series3'])
df['prev value/unit'] = [99,99,99]
df['value'] = [100,100,100]
df['units'] = [100,100,0]
df['value/unit'] = df['value']/df['units']
This creates a DataFrame containing some division-by-zero values, as shown below. Business logic dictates that if there is a division by zero, the previous value/unit should be used instead.
         prev value/unit  value  units  value/unit
series1               99    100    100    1.000000
series2               99    100    100    1.000000
series3               99    100      0         inf
So adding:
df.loc[df.units == 0, 'value/unit'] = df['prev value/unit']
has the desired effect and the inf above gets correctly overwritten by 99 (the previous per unit value).
However, if there are no divisions by zero:
df.loc[df.units == 0, 'value/unit']
# is an empty Series
#Series([], name: value/unit, dtype: float64)
and assigning df['prev value/unit'] to it overwrites all the values!
For example:
df = pd.DataFrame(index=['series1', 'series2', 'series3'])
df['prev value/unit'] = [99,99,99]
df['value'] = [100,100,100]
df['units'] = [100,100,100]
df['value/unit'] = df['value']/df['units']
df.loc[df.units == 0, 'value/unit'] = df['prev value/unit']
gives:
         prev value/unit  value  units  value/unit
series1               99    100    100          99
series2               99    100    100          99
series3               99    100    100          99
which is totally unexpected. Did I accidentally misuse the .loc syntax, or is this a bug? I am specifically using it to avoid assigning to temporary views of the DataFrame. For reference, I am using pandas 0.13.1.
Upvotes: 4
Views: 10331
Reputation: 52236
I'm assuming it has something to do with views/copies, but it certainly seems like unexpected behavior. You might open an issue on GitHub:
https://github.com/pydata/pandas/issues
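In the meantime, one possible workaround is to skip the assignment entirely when the mask selects nothing. This is only a sketch: it assumes the overwrite happens solely for an all-False mask, as the question describes, and fill_from_previous is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def fill_from_previous(df):
    # Hypothetical helper: use 'prev value/unit' wherever units == 0.
    mask = df['units'] == 0
    if mask.any():
        # Index both sides with the same mask so only the matching rows are touched.
        df.loc[mask, 'value/unit'] = df.loc[mask, 'prev value/unit']
    return df

# Same setup as the question's second example (no zero units).
df = pd.DataFrame(index=['series1', 'series2', 'series3'])
df['prev value/unit'] = [99, 99, 99]
df['value'] = [100, 100, 100]
df['units'] = [100, 100, 100]
df['value/unit'] = df['value'] / df['units']
df = fill_from_previous(df)
```

With no zero units the guard never runs the assignment, so value/unit keeps the computed ratios.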
An alternative way to write the code would be using numpy.where, e.g.
In [86]: import numpy as np
In [87]: df['value/unit'] = np.where(df['units'] == 0, df['prev value/unit'], df['value']/df['units'])
In [88]: df
Out[88]:
         prev value/unit  value  units  value/unit
series1               99    100    100           1
series2               99    100    100           1
series3               99    100    100           1
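The same fallback can probably also be written without numpy, using Series.mask. This is a minimal sketch, assuming a pandas version in which Series.mask accepts a replacement value as its second argument; the data mirrors the question's first example, and ratio is just an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame(
    {'prev value/unit': [99, 99, 99],
     'value': [100, 100, 100],
     'units': [100, 100, 0]},
    index=['series1', 'series2', 'series3'],
)

# Plain division; rows with units == 0 come out as inf.
ratio = df['value'] / df['units']

# Replace the ratio with the previous value wherever units == 0.
df['value/unit'] = ratio.mask(df['units'] == 0, df['prev value/unit'])
```

Because the whole column is rebuilt in one expression, there is no conditional assignment through .loc, so an all-False condition simply leaves every ratio as computed.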
Upvotes: 5