nickm

Reputation: 133

Making a new column in pandas based on conditions of other columns

I would like to make a new column based on an if statement that has conditionals of two or more other columns in a dataframe.

For example, column3 = True if (column1 < 10.0) and (column2 > 0.0).

I have looked around, and it seems that others have used the apply method with a lambda function, but I am a bit of a novice with these.

I suppose I could make two additional columns, each set to 1 in rows where the condition on its source column is met, then sum them to check whether all conditions are met, but this seems a bit inelegant.

If you provide an answer with apply/lambda, let's suppose the dataframe is called sample_df and the columns are col1, col2, and col3.

Thanks so much!

Upvotes: 4

Views: 10481

Answers (1)

pansen

Reputation: 6663

You can use DataFrame.eval to keep this short:

import numpy as np
import pandas as pd

# create some dummy data (random, so your values will differ)
df = pd.DataFrame(np.random.randint(0, 10, size=(5, 2)),
                  columns=["col1", "col2"])
print(df)

    col1    col2
0   1       7
1   2       3
2   4       6
3   2       5
4   5       4

df["col3"] = df.eval("col1 < 5 and col2 > 5")
print(df)

    col1    col2    col3
0   1       7       True
1   2       3       False
2   4       6       True
3   2       5       False
4   5       4       False

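You can also write it without eval as df["col3"] = (df["col1"] < 5) & (df["col2"] > 5). The parentheses are required because & binds more tightly than the comparison operators.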
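As a minimal sketch of that non-eval form, the snippet below rebuilds the frame with the fixed values from the printed output above (instead of random data) and checks that the boolean mask matches the eval result. The names mask and eval_result are only illustrative.

```python
import pandas as pd

# Fixed values copied from the printed example, so the result is reproducible
df = pd.DataFrame({"col1": [1, 2, 4, 2, 5], "col2": [7, 3, 6, 5, 4]})

# Element-wise AND of two boolean Series; each comparison is parenthesized
mask = (df["col1"] < 5) & (df["col2"] > 5)
df["col3"] = mask

# The eval expression from above should give the same values row by row
eval_result = df.eval("col1 < 5 and col2 > 5")
print((mask == eval_result).all())
print(df)
```

Note that the Python keyword and does not work between two Series outside of eval; use & for element-wise AND and | for element-wise OR.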

You can also wrap the condition in np.where to set explicit values for the matching and non-matching rows in one step:

df["col4"] = np.where(df.eval("col1 < 5 and col2 > 5"), "Positive Value", "Negative Value")
print(df)

    col1    col2    col3    col4
0   1       7       True    Positive Value
1   2       3       False   Negative Value
2   4       6       True    Positive Value
3   2       5       False   Negative Value
4   5       4       False   Negative Value
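Since the question asked about apply with a lambda, here is a rough sketch of that variant using the question's names (sample_df, col1, col2, col3) and its thresholds. The data values are made up for illustration. apply with axis=1 calls the lambda once per row, so it is usually much slower than the vectorized forms above on large frames.

```python
import pandas as pd

# Made-up sample values (an assumption, not from the question)
sample_df = pd.DataFrame({
    "col1": [1.5, 12.0, 9.9, 3.0],
    "col2": [2.0, 5.0, -1.0, 0.0],
})

# axis=1 passes each row to the lambda; plain `and` is fine here
# because row["col1"] and row["col2"] are scalars, not Series
sample_df["col3"] = sample_df.apply(
    lambda row: row["col1"] < 10.0 and row["col2"] > 0.0,
    axis=1,
)

# Vectorized equivalent, for comparison
vectorized = (sample_df["col1"] < 10.0) & (sample_df["col2"] > 0.0)
print(sample_df)
```

Both approaches should give the same column; the vectorized one is generally preferable unless the per-row logic cannot be expressed with column operations.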

Upvotes: 2
