gorjan

Reputation: 5555

Pandas: Set a value on a data-frame using loc then iloc

Let's assume that I have the following data-frame:

df = pd.DataFrame(np.zeros((3, 5)), columns=["feature_a", "feature_b", "feature_c", "feature_d", "e"])

    feature_a   feature_b   feature_c   feature_d   e
0      0.0         0.0         0.0         0.0     0.0
1      0.0         0.0         0.0         0.0     0.0
2      0.0         0.0         0.0         0.0     0.0

Please note that the data-frame I am actually dealing with is much larger. What I would like to do is update the values of the middle two feature columns in the second row, so that the result would be:

    feature_a   feature_b   feature_c   feature_d   e
0      0.0         0.0         0.0         0.0     0.0
1      0.0        88.0        88.0         0.0     0.0
2      0.0         0.0         0.0         0.0     0.0

What I tried, assuming it would work:

feature_columns = df.filter(like="feature").columns.values
df.loc[:,feature_columns].iloc[1,[1, 2]] = 88

It is very important to me that the solution follows the pattern I have tried, for two reasons:

  1. The names of the columns that I wish to update all contain a common pattern.
  2. Once I have the columns selected, I know the row and column positions that I wish to update.

So my question is: how can I get from the starting data-frame to the resulting data-frame with a solution that follows the approach I have tried?

Upvotes: 0

Views: 65

Answers (1)

tel

Reputation: 13999

This is how you deal with the specific example you show in your question:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((3, 5)), columns=["feature_a", "feature_b", "feature_c", "feature_d", "e"])
feature_columns = df.filter(like="feature").columns.values

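# select by position within the feature columns; the result is a Series
# whose name is the row label and whose index holds the column labels
sli = df[feature_columns].iloc[1,[1,2]]
# assign through a single .loc call on the original frame, using those labels
df.loc[sli.name, sli.index] = 88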
print(df)
# output
#        feature_a  feature_b  feature_c  feature_d    e
#     0        0.0        0.0        0.0        0.0  0.0
#     1        0.0       88.0       88.0        0.0  0.0
#     2        0.0        0.0        0.0        0.0  0.0

Depending on the precise application, you might have to change the exact syntax a bit, but the idea is broadly applicable: harvest the row and column labels of your selection, use those to index df.loc, then assign to that selection. That gets around the "assign to copy" issue you're running into: df.loc[:, feature_columns] with a list of columns returns a new object, so the chained .iloc assignment modifies that temporary copy rather than df itself.
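If you need this in more than one place, the same idea can be wrapped in a small function. The following is only a sketch: the helper name assign_in_subset and its parameters are made up for illustration, and it assumes the row and column labels are unique. It turns positions relative to the selected columns back into labels, then assigns once through df.loc. The second part shows a purely positional variant that maps the labels back to positions with Index.get_indexer and assigns through df.iloc.

```python
import numpy as np
import pandas as pd


def assign_in_subset(df, columns, row_pos, col_pos, value):
    """Hypothetical helper: row_pos/col_pos are positions within df[columns]."""
    row_labels = df.index[row_pos]            # position(s) -> row label(s)
    col_labels = pd.Index(columns)[col_pos]   # position(s) -> column label(s)
    # one .loc call on the original frame, so no intermediate copy is involved
    df.loc[row_labels, col_labels] = value
    return df


names = ["feature_a", "feature_b", "feature_c", "feature_d", "e"]
df = pd.DataFrame(np.zeros((3, 5)), columns=names)
feature_columns = df.filter(like="feature").columns
assign_in_subset(df, feature_columns, 1, [1, 2], 88)

# Positional variant: translate the chosen labels back to positions in df
df2 = pd.DataFrame(np.zeros((3, 5)), columns=names)
col_pos = df2.columns.get_indexer(feature_columns[[1, 2]])
df2.iloc[1, col_pos] = 88
```

Both variants write to the original frame in a single indexing step, which is what avoids the copy problem.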

Upvotes: 2
