anon01

Reputation: 11171

How can I modify a dataframe element from a Series defined through df.loc[row]?

I have code where a function/method accepts a Series (a row from df) and is supposed to modify it in-place, so that the changes are reflected in the original df. However, I seem unable to force the modification to act on a view rather than a copy. Information from the documentation and a related question on Stack Overflow does not resolve the issue, as shown by the example below:

import pandas as pd
pd.__version__ # 0.24.2

ROW_NAME = "r1"
COL_NAME = "B"
NEW_VAL = 100.0

# df I would like to modify in-place
df = pd.DataFrame({"A":[[1], [2], [3,4]], "B": [1.0, 2.0, 3.0]}, index=["r1", "r2", "r3"])

# a row (Series reference) is the input param to a function that should modify df in-place
record = df.loc[ROW_NAME]
record.loc[COL_NAME] = NEW_VAL
assert df.loc[ROW_NAME, COL_NAME] == NEW_VAL # raises AssertionError: df still holds 1.0

The line starting with record.loc results in the familiar warning: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame, which might make sense, except that record appears to reference df and can be modified in-place under some circumstances. An example of this:

record = df.loc[ROW_NAME]
record.loc["A"].append(NEW_VAL)
assert NEW_VAL in df.loc["r1", "A"] # True

My question is: how can I force a modification of the float value at df.loc[ROW_NAME, COL_NAME] in-place from the Series record? Bonus points for clarifying why it is possible to modify column A in-place but not column B in the examples above.


Upvotes: 0

Views: 411

Answers (2)

anon01

Reputation: 11171

Based on the sources linked in the question and a thorough reading of the documentation, it does not appear possible to enforce returning a view vs copy of a Series generated from a DataFrame row.

As @Lilith Schneider points out, the original confusion over this comes from the fact that record = df.loc["r1"] returns a shallow copy - some hybrid of a copy and view that may cause confusion and lead to unexpected behavior.
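
Since a view cannot be forced, one possible workaround is to have the function report the changes it wants and let the caller write them into df. The sketch below is only an illustration under that assumption: compute_updates is a hypothetical helper (not part of pandas), and the multiplication by 100 is an arbitrary stand-in for the real logic.

```python
import pandas as pd


def compute_updates(record: pd.Series) -> dict:
    """Hypothetical helper: read a row and return {column: new_value}."""
    # Arbitrary example logic; the row itself is never mutated here.
    return {"B": record["B"] * 100}


df = pd.DataFrame(
    {"A": [[1], [2], [3, 4]], "B": [1.0, 2.0, 3.0]},
    index=["r1", "r2", "r3"],
)

row_name = "r1"
# The caller applies each update with a single, unchained indexer on df.
for col, value in compute_updates(df.loc[row_name]).items():
    df.at[row_name, col] = value
```

Only scalar cells are written back here; the list in column A is left untouched.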

Upvotes: 1

Lilith Schneider

Reputation: 51

I think this behavior is confusing because record in this case is a shallow copy of your data frame row.

If you refer to this stack post it sounds like .loc[] is generally expected to return a copy and not a view, and that assignment will not work if the .locs have been chained.

I did confirm that if you modify the original data frame directly, the assignment works.

df.loc[ROW_NAME, COL_NAME] = NEW_VAL
assert(df.loc[ROW_NAME, COL_NAME] == NEW_VAL) # True
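
If the modification has to happen inside a function, the same idea can be kept by passing the data frame and the row label instead of the row itself. This is a minimal sketch; set_cell is a name made up for this example.

```python
import pandas as pd


def set_cell(frame: pd.DataFrame, row, col, value) -> None:
    """Write value into frame at (row, col) using one .loc call on the frame."""
    frame.loc[row, col] = value


df = pd.DataFrame(
    {"A": [[1], [2], [3, 4]], "B": [1.0, 2.0, 3.0]},
    index=["r1", "r2", "r3"],
)

set_cell(df, "r1", "B", 100.0)
```

Because the assignment goes through df itself rather than through an intermediate Series, there is no copy in between to absorb the change.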

And as for the .append still working, this is why I mentioned the "shallow" copy behavior. Your new record copy still contains a reference to the original list in column A. See this post for a refresher on the difference between binding to a new object vs mutating an existing object.
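
To make the difference concrete, here is a small sketch using the data frame from the question. The warnings filter is only there to silence the SettingWithCopyWarning that some pandas versions emit on the rebinding line.

```python
import warnings

import pandas as pd

df = pd.DataFrame(
    {"A": [[1], [2], [3, 4]], "B": [1.0, 2.0, 3.0]},
    index=["r1", "r2", "r3"],
)

# The columns have different dtypes, so the row comes back as a new object Series.
record = df.loc["r1"]

# Mutation: record and df both reference the same list object, so df sees the change.
record["A"].append(100)
same_list = record["A"] is df.loc["r1", "A"]

# Rebinding: this replaces the value stored in record only; df keeps its own float.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    record["B"] = 100.0
```

After this runs, df.loc["r1", "A"] contains the appended value, while df.loc["r1", "B"] is still 1.0.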

Upvotes: 1
