taathtwostep

Reputation: 35

Creating a true or false column based on another column's values

import numpy as np
import pandas as pd

df_summary_sklearn = pd.DataFrame(columns=['Method','Prediction' , 'is_best'])
df_summary_sklearn['Method'] = ['1-KNN', '5-KNN', '10-KNN']
df_summary_sklearn['Prediction'] = [1, 2, 3] # example data
df_summary_sklearn['is_best'] = np.where((df_summary_sklearn['Prediction']).min(), True, False)

print(df_summary_sklearn)

I want to return True in the is_best column for the minimum value of the Prediction column, and False for the rest. So what I want is this:

   Method  Prediction  is_best
0   1-KNN           1     True
1   5-KNN           2    False
2  10-KNN           3    False

But what I'm currently getting is this:

   Method  Prediction  is_best
0   1-KNN           1     True
1   5-KNN           2     True
2  10-KNN           3     True

How do I correctly create this column?

Upvotes: 1

Views: 630

Answers (1)

Nikhil Kumar

Reputation: 1232

Since you haven't provided a minimal working example, I've used contrived values for the Prediction column.

df_summary_sklearn = pd.DataFrame(columns=['Method','Prediction' , 'is_best'])
df_summary_sklearn['Method'] = ['1-KNN', '5-KNN', '10-KNN']
df_summary_sklearn['Prediction'] = [1, 2, 3]

# This returns a boolean Series that is True wherever Prediction equals the column minimum.
df_summary_sklearn['Prediction'] == df_summary_sklearn['Prediction'].min()

# Set the is_best column to the result of the previous statement.
df_summary_sklearn['is_best'] = df_summary_sklearn['Prediction'] == df_summary_sklearn['Prediction'].min()

This gives the following output.

>>> df_summary_sklearn
   Method  Prediction  is_best
0   1-KNN           1     True
1   5-KNN           2    False
2  10-KNN           3    False

You can replace the [1, 2, 3] in the above example with your own values, such as the [one_fold, five_fold, ten_fold] from your question.
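
As for why the original attempt marked every row True: np.where was handed the scalar result of .min() rather than a per-row condition. Here is a rough sketch of the difference, using the same example data. The shorter name df and the extra is_first_best column are my own additions for illustration and are not part of your code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Method': ['1-KNN', '5-KNN', '10-KNN'],
    'Prediction': [1, 2, 3],  # example data from the question
})

# .min() returns a single number (1 here). np.where only checks whether that
# number is truthy, so the result is one value, not one value per row.
# Note that a minimum of 0 would be falsy and give False instead.
scalar_result = np.where(df['Prediction'].min(), True, False)

# Comparing the column to its minimum gives one boolean per row.
# np.where is optional here; the comparison alone is already boolean.
df['is_best'] = np.where(df['Prediction'] == df['Prediction'].min(), True, False)

# Hypothetical extra column: if several rows tie for the minimum, the
# comparison above marks all of them. idxmin() picks only the first one.
df['is_first_best'] = df.index == df['Prediction'].idxmin()

print(df)
```

If ties cannot occur in your data, the two columns are identical and the plain comparison is the simpler choice.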

Upvotes: 3
