Reputation: 3514
My use case is a common one: binary classification with unbalanced labels, so we decided to use the F1 score for hyper-parameter selection via cross-validation. We are using PySpark 2.3 and pyspark.ml. We create a CrossValidator object, but we run into an issue with the evaluator.
The problem is that I am on a corporate/enterprise Spark cluster with no plans to upgrade the current version (2.3), so the question is: how can I use the F1 score in a CrossValidator evaluator for a binary case, given that we are restricted to Spark 2.3?
Upvotes: 4
Views: 2286
Reputation: 2939
You can create a class for this. I had the same problem with my company's Spark 2.4, so I tried to make an F1 score evaluator for binary classification. I had to implement the .evaluate
and .isLargerBetter
methods on the new class. Here is sample code from when I tried it on this dataset:
class F1BinaryEvaluator():
    def __init__(self, predCol="prediction", labelCol="label", metricLabel=1.0):
        self.labelCol = labelCol
        self.predCol = predCol
        self.metricLabel = metricLabel

    def isLargerBetter(self):
        # CrossValidator maximizes the metric when this returns True
        return True

    def evaluate(self, dataframe):
        # count true positives, false positives and false negatives via SQL filter expressions
        tp = dataframe.filter(self.labelCol + ' = ' + str(self.metricLabel) + ' and ' + self.predCol + ' = ' + str(self.metricLabel)).count()
        fp = dataframe.filter(self.labelCol + ' != ' + str(self.metricLabel) + ' and ' + self.predCol + ' = ' + str(self.metricLabel)).count()
        fn = dataframe.filter(self.labelCol + ' = ' + str(self.metricLabel) + ' and ' + self.predCol + ' != ' + str(self.metricLabel)).count()
        # F1 = tp / (tp + 0.5 * (fp + fn))
        return tp / (tp + (.5 * (fn + fp)))
f1_evaluator = F1BinaryEvaluator()
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.classification import GBTClassifier
gbt = GBTClassifier()
paramGrid = (ParamGridBuilder()
             .addGrid(gbt.maxDepth, [3, 5, 7])
             .addGrid(gbt.maxBins, [10, 30])
             .addGrid(gbt.maxIter, [10, 15])
             .build())
cv = CrossValidator(estimator=gbt, estimatorParamMaps=paramGrid, evaluator=f1_evaluator, numFolds=5)
cvModel = cv.fit(train)
cv_pred = cvModel.bestModel.transform(test)
The CV process ran with no problems, though I can't speak to its performance. I also compared the evaluator against sklearn.metrics.f1_score
and the values are close.
from sklearn.metrics import f1_score
print("made-up F1 Score evaluator : ", f1_evaluator.evaluate(cv_pred))
print("sklearn F1 Score evaluator : ", f1_score(cv_pred.select('label').toPandas(), cv_pred.select('prediction').toPandas()))
made-up F1 Score evaluator : 0.9363636363636364
sklearn F1 Score evaluator : 0.9363636363636363
Upvotes: 1
Reputation: 13646
If you could use Spark v3.0+, the easiest way would be to use the F-measure by label
metric, specifying the label (and setting beta to 1):
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

evaluator = MulticlassClassificationEvaluator(metricName='fMeasureByLabel', metricLabel=1, beta=1.0)
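That evaluator plugs straight into cross-validation. A minimal sketch (Spark 3.0+ only), reusing the evaluator above and assuming an estimator, parameter grid and training DataFrame like the gbt, paramGrid and train in the other answer:

    from pyspark.ml.tuning import CrossValidator

    # Spark 3.0+ sketch: gbt, paramGrid and train are assumed to exist as in the other answer
    cv = CrossValidator(estimator=gbt, estimatorParamMaps=paramGrid,
                        evaluator=evaluator, numFolds=5)
    cvModel = cv.fit(train)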
But since you are restricted to v2.3, you can either:

- reimplement the CrossValidator functionality yourself. pyspark.mllib.evaluation.MulticlassMetrics
has an fMeasure
by label method; see the example for reference, and the sketch below this list, or
- change your metric to areaUnderPR
from BinaryClassificationEvaluator
, which is also a "goodness of model" metric and should do the job for you with unbalanced labels. This blogpost compares F1 and AUC-PR.
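A minimal sketch of the metric computation for the first option, assuming predictions is a DataFrame with the usual 'prediction' and 'label' columns (the name predictions is only an illustration):

    from pyspark.mllib.evaluation import MulticlassMetrics

    # MulticlassMetrics expects an RDD of (prediction, label) pairs of floats;
    # 'predictions' is an assumed DataFrame of model output for illustration
    pred_and_labels = predictions.select('prediction', 'label') \
                                 .rdd.map(lambda row: (float(row[0]), float(row[1])))
    metrics = MulticlassMetrics(pred_and_labels)

    # F1 for the positive class (beta=1.0 is the harmonic mean of precision and recall)
    f1_positive = metrics.fMeasure(label=1.0, beta=1.0)

You would still have to loop over the parameter grid and folds yourself and call this on each fold's predictions, which is what "reimplement the CrossValidator functionality" amounts to.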
Upvotes: 2