baxx

Reputation: 4705

Replace pandas dataframe values within a particular column based on values within a different column

Given the following dataframe:

import pandas as pd

x = pd.DataFrame(
    {"a": [1, 2, 3, 2], "b_1": [0, 0, 0, 0], "b_2": [0, 0, 0, 0], "b_3": [0, 0, 0, 0]}
)

Which looks like this:

   a  b_1  b_2  b_3
0  1    0    0    0
1  2    0    0    0
2  3    0    0    0
3  2    0    0    0

How can it be converted to:

y = pd.DataFrame(
    {
        "a": [1, 2, 3, 2],
        "b_1": [-1, 0, 0, 0],
        "b_2": [0, -1, 0, -1],
        "b_3": [0, 0, -1, 0],
    }
)

which looks like this:

   a  b_1  b_2  b_3
0  1   -1    0    0
1  2    0   -1    0
2  3    0    0   -1
3  2    0   -1    0

edit 2

Here's a solution:

# Reshape to long form, keeping the original row index
x1 = x.melt(id_vars="a", ignore_index=False)
# Extract the numeric suffix of each column name ("b_2" -> 2)
x1["value_2"] = x1["variable"].str.split("_").str[1].astype(int)
# Set the cells whose suffix matches the value in "a" to -1
x1.loc[x1["a"].eq(x1["value_2"]), "value"] = -1
x1 = x1.drop("value_2", axis=1)
# Pivot back to wide form and restore "a" as a regular column
x1 = x1.set_index(["a", "variable"], append=True)["value"].unstack().reset_index(level=1)

It feels quite messy, though.

Upvotes: 2

Views: 55

Answers (1)

Ben.T

Reputation: 29635

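You can use pd.get_dummies, which creates one indicator column per distinct value in a. Passing dtype=int keeps the output as 0/1 integers; pandas 2.0 and later return booleans by default.

print(pd.get_dummies(x['a'], dtype=int).add_prefix('b_'))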
   b_1  b_2  b_3
0    1    0    0
1    0    1    0
2    0    0    1
3    0    1    0

Then you have several options for subtracting it from x. For example, you can use reindex, which adds the missing a column filled with 0 so that a itself is left unchanged by the subtraction.

y = x - pd.get_dummies(x['a'], dtype=int).add_prefix('b_').reindex(columns=x.columns, fill_value=0)
print(y)
   a  b_1  b_2  b_3
0  1   -1    0    0
1  2    0   -1    0
2  3    0    0   -1
3  2    0   -1    0
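
Another of those options, as a minimal sketch: subtract only on the indicator columns and leave a alone. This assumes every value in a already has a matching b_* column in x (otherwise the column selection raises a KeyError); the name dummies is just a helper variable introduced here.

```python
import pandas as pd

x = pd.DataFrame(
    {"a": [1, 2, 3, 2], "b_1": [0, 0, 0, 0], "b_2": [0, 0, 0, 0], "b_3": [0, 0, 0, 0]}
)

# dtype=int keeps the indicators as 0/1 integers instead of booleans
dummies = pd.get_dummies(x["a"], dtype=int).add_prefix("b_")

y = x.copy()
# Subtract only on the indicator columns; "a" is never touched
y[dummies.columns] = y[dummies.columns] - dummies
print(y)
```

Working on a copy keeps x intact, which helps if you want to compare before and after.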

Note that if x does not already have the b_* columns and you want to generate them from column a, something like this would work too.

x = pd.DataFrame({"a": [1, 2, 3, 2]})
y = x.sub(pd.get_dummies(x['a']).add_prefix('b_'), fill_value=0)
print(y)
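
One thing to watch with the fill_value route: the two frames are aligned before the subtraction, and that alignment introduces missing values, so the result may come back as floats. A hedged sketch of the same idea with a cast back to integers at the end (the astype call and the dummies helper name are additions here, not part of the snippet above):

```python
import pandas as pd

x = pd.DataFrame({"a": [1, 2, 3, 2]})

# One 0/1 indicator column per distinct value in "a"
dummies = pd.get_dummies(x["a"], dtype=int).add_prefix("b_")

# Columns present on only one side are treated as 0 on the other side,
# so "a" is kept as-is and each b_* column becomes 0 - indicator
y = x.sub(dummies, fill_value=0).astype(int)
print(y)
```

The cast is safe here because no cell is missing on both sides, so nothing is left as NaN after the fill.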

Upvotes: 5
