Reputation: 11171
I have code where a function/method accepts a Series (a row from a df) and is supposed to modify it in-place, such that the changes are reflected in the original df. However, I seem unable to force the modification to act on a view rather than a copy. Information from the documentation and a related question on Stack Overflow does not resolve the issue, as shown by the example below:
import pandas as pd
pd.__version__ # 0.24.2
ROW_NAME = "r1"
COL_NAME = "B"
NEW_VAL = 100.0
# df I would like to modify in-place
df = pd.DataFrame({"A":[[1], [2], [3,4]], "B": [1.0, 2.0, 3.0]}, index=["r1", "r2", "r3"])
# a row (Series reference) is the input param to a function that should modify df in-place
record = df.loc[ROW_NAME]
record.loc[COL_NAME] = NEW_VAL
assert df.loc[ROW_NAME, COL_NAME] == NEW_VAL  # fails (AssertionError): df is unchanged
The line starting with record.loc results in the familiar warning:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
This might make sense, except that record appears to reference df and can be modified in-place under some circumstances. An example of this:
record = df.loc[ROW_NAME]
record.loc["A"].append(NEW_VAL)
assert NEW_VAL in df.loc["r1", "A"] # True
My question is: how can I force a modification of the float value at df.loc[ROW_NAME, COL_NAME] in-place from the Series record? Bonus points for clarifying why it is possible to modify column A in-place but not column B in the examples above.
Upvotes: 0
Views: 411
Reputation: 11171
Based on the sources linked in the question and a thorough reading of the documentation, it does not appear possible to force pandas to return a view rather than a copy when a Series is generated from a DataFrame row.
As @Lilith Schneider points out, the original confusion comes from the fact that record = df.loc["r1"] returns a shallow copy in this example: the Series itself is a new object, but its elements still reference the same Python objects as the DataFrame. This hybrid of copy and view can lead to unexpected behavior.
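Since a view cannot be forced, one possible workaround is to pass the DataFrame and the row label to the function instead of the row Series, and to write through df.loc with both labels in a single call. The sketch below is only an illustration under that assumption; update_record is a hypothetical helper name, not something from pandas or from the question.

```python
import pandas as pd


def update_record(df, row_label, col_name, new_val):
    """Hypothetical helper: set one cell on df itself, not on a row copy."""
    # A single .loc call with both row and column labels assigns into df directly,
    # so no intermediate Series (and no SettingWithCopyWarning) is involved.
    df.loc[row_label, col_name] = new_val


df = pd.DataFrame(
    {"A": [[1], [2], [3, 4]], "B": [1.0, 2.0, 3.0]},
    index=["r1", "r2", "r3"],
)

update_record(df, "r1", "B", 100.0)
assert df.loc["r1", "B"] == 100.0
```

The trade-off is that the function now needs to know which DataFrame the row belongs to, rather than receiving a standalone Series.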
Upvotes: 1
Reputation: 51
I think this behavior is confusing because record in this case is a shallow copy of your data frame row.
If you refer to this stack post, it sounds like .loc[] is generally expected to return a copy and not a view, and that assignment will not work if the .loc calls have been chained.
I did confirm that if you modify the original data frame directly, it will work:
df.loc[ROW_NAME, COL_NAME] = NEW_VAL
assert(df.loc[ROW_NAME, COL_NAME] == NEW_VAL) # True
And as for the .append still working, this is why I mentioned the "shallow" copy behavior. Your new record copy still contains a reference to the original list in column A. See this post for a refresher on the difference between binding to a new object vs mutating an existing object.
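The binding-versus-mutating distinction can be illustrated without pandas. In the rough sketch below, a plain dict stands in for the data frame row and copy.copy stands in for what df.loc[ROW_NAME] appears to do here. That analogy is an assumption made for illustration, not a description of pandas internals.

```python
import copy

# Stand-in for one DataFrame row: a list in "A", a float in "B".
row = {"A": [1], "B": 1.0}

# Shallow copy: a new container whose values are the same objects as in row.
record = copy.copy(row)

# Binding: the copy's "B" slot now points at a new float; row is untouched.
record["B"] = 100.0

# Mutating: the list object is shared, so the change is visible through row too.
record["A"].append(100.0)

assert row["B"] == 1.0
assert row["A"] == [1, 100.0]
assert record["A"] is row["A"]
```

Assigning to column B rebinds a slot in the copy, while .append on column A changes an object that both the copy and the original still point to.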
Upvotes: 1