Luis Leal

Reputation: 3514

How to use f1-score for the CrossValidator evaluator in a binary problem (BinaryClassificationEvaluator) in pyspark 2.3

My use case is a common one: binary classification with unbalanced labels, so we decided to use the f1-score for hyper-parameter selection via cross-validation. We are on pyspark 2.3 with pyspark.ml and we can build the CrossValidator object, but the evaluator is the sticking point: in 2.3 there is no f1 metric for the binary case.

We use a corporate/enterprise Spark cluster with no plans to upgrade the current version (2.3), so the question is: how can I use the f1 score in a CrossValidator evaluator for the binary case, given that we are restricted to Spark 2.3?
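To illustrate what I mean (as far as I can tell from the 2.3 API, using the stock evaluators):

from pyspark.ml.evaluation import BinaryClassificationEvaluator, MulticlassClassificationEvaluator

# In 2.3 the binary evaluator only knows these two metrics, no f1:
BinaryClassificationEvaluator(metricName='areaUnderROC')
BinaryClassificationEvaluator(metricName='areaUnderPR')

# The multiclass evaluator has 'f1', but in 2.3 it is the label-weighted F1 over
# all classes; there is no metricLabel to target only the positive class.
MulticlassClassificationEvaluator(metricName='f1')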

Upvotes: 4

Views: 2286

Answers (2)

AdibP

Reputation: 2939

You can create a class for this. I had the same problem with my company's Spark 2.4, so I tried to build an F1-score evaluator for binary classification. I only had to implement the .evaluate and .isLargerBetter methods on the new class so that CrossValidator can use it. Here is sample code from when I tried it on this dataset:

class F1BinaryEvaluator:

    def __init__(self, predCol="prediction", labelCol="label", metricLabel=1.0):
        self.labelCol = labelCol
        self.predCol = predCol
        self.metricLabel = metricLabel

    def isLargerBetter(self):
        # F1 should be maximized, so larger is better
        return True

    def evaluate(self, dataframe):
        # count true positives, false positives and false negatives with SQL filter expressions
        tp = dataframe.filter(self.labelCol + ' = ' + str(self.metricLabel) + ' and ' + self.predCol + ' = ' + str(self.metricLabel)).count()
        fp = dataframe.filter(self.labelCol + ' != ' + str(self.metricLabel) + ' and ' + self.predCol + ' = ' + str(self.metricLabel)).count()
        fn = dataframe.filter(self.labelCol + ' = ' + str(self.metricLabel) + ' and ' + self.predCol + ' != ' + str(self.metricLabel)).count()
        # F1 = tp / (tp + 0.5 * (fp + fn))
        return tp / (tp + (.5 * (fn + fp)))


f1_evaluator = F1BinaryEvaluator()

from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.classification import GBTClassifier
gbt = GBTClassifier()
paramGrid = (ParamGridBuilder()
             .addGrid(gbt.maxDepth, [3, 5, 7])
             .addGrid(gbt.maxBins, [10, 30])
             .addGrid(gbt.maxIter, [10, 15])
             .build())
# the custom evaluator works here because CrossValidator only calls .evaluate and .isLargerBetter on it
cv = CrossValidator(estimator=gbt, estimatorParamMaps=paramGrid, evaluator=f1_evaluator, numFolds=5)

cvModel = cv.fit(train)
cv_pred = cvModel.bestModel.transform(test)

The CV process ran with no problems, though I can't speak for its performance (the evaluator triggers three .count() jobs per evaluation). I also compared it against sklearn.metrics.f1_score and the values are close.

from sklearn.metrics import f1_score
print("made-up F1 Score evaluator : ", f1_evaluator.evaluate(cv_pred))
print("sklearn F1 Score evaluator : ", f1_score(cv_pred.select('label').toPandas(), cv_pred.select('prediction').toPandas()))

made-up F1 Score evaluator :  0.9363636363636364
sklearn F1 Score evaluator :  0.9363636363636363

Upvotes: 1

igrinis

Reputation: 13646

If you could use Spark v3.0+, the easiest way would be to use the F-measure-by-label metric and specify the label (with beta set to 1):

from pyspark.ml.evaluation import MulticlassClassificationEvaluator

evaluator = MulticlassClassificationEvaluator(metricName='fMeasureByLabel', metricLabel=1, beta=1.0)

But since you are restricted to v2.3, you can either

  1. reimplement the CrossValidator functionality yourself; pyspark.mllib.evaluation.MulticlassMetrics has an fMeasure-by-label method. See the example for reference, and the sketch below this list.

  2. change your metric to areaUnderPR from BinaryClassificationEvaluator, which is a reasonable "goodness of model" metric for unbalanced labels and should do the job for you; a one-liner for this follows below the list. This blog post compares F1 and AUC-PR.
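A rough sketch of option 1 (assuming the gbt, paramGrid and train from the other answer, and using a single train/validation split instead of full k-fold cross-validation for brevity) could look like this:

from pyspark.mllib.evaluation import MulticlassMetrics

def f1_positive(predictions, label=1.0):
    # MulticlassMetrics expects an RDD of (prediction, label) pairs of doubles
    pred_and_label = predictions.select('prediction', 'label').rdd \
                                .map(lambda row: (float(row[0]), float(row[1])))
    return MulticlassMetrics(pred_and_label).fMeasure(label, beta=1.0)

# manual model selection over the same param grid
train_df, valid_df = train.randomSplit([0.8, 0.2], seed=42)
scores = []
for params in paramGrid:
    model = gbt.copy(params).fit(train_df)
    scores.append((params, f1_positive(model.transform(valid_df))))
best_params, best_f1 = max(scores, key=lambda s: s[1])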
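Option 2 is essentially a one-liner on the stock evaluator (again assuming the gbt and paramGrid from the other answer):

from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator

pr_evaluator = BinaryClassificationEvaluator(metricName='areaUnderPR')
cv = CrossValidator(estimator=gbt, estimatorParamMaps=paramGrid,
                    evaluator=pr_evaluator, numFolds=5)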

Upvotes: 2
