Reputation: 19
TensorFlow: 1.6.0
TensorBoard: 1.6.0
Estimator
I use tf.estimator.DNNClassifier to train a binary classification model on a heavily skewed (i.e. imbalanced) dataset.
I use the Precision-Recall curve to choose an optimal model instead of the AUC curve.
I use tf.estimator.DNNClassifier (of course, I did change these three parameters: hidden_units, feature_columns, model_dir).
In Step 4, every time I picked out a feature I got a new training result and a new picture of the auc_precision_recall curve from TensorBoard.
Namely, when I picked out FEATURE_A I got figure A, when I picked out FEATURE_B I got figure B, and when I picked out FEATURE_C I got figure C.
The auc_precision_recall
curve figures:
x axes: the training step.
y axes: range from 0 to 1 (this is what I want to know: what does y mean?).
Here is a Precision-Recall curve from this site. (I paste it here just to make my problem easier to discuss.)
The Precision-Recall curve:
x axes: Recall, range from 0 to 1.
y axes: Precision, range from 0 to 1.
My questions:
1. What is the meaning of the y axes in a TensorBoard auc_precision_recall curve?
2. What is the difference between a TensorBoard auc_precision_recall curve and a standard Precision-Recall curve?
3. Why are the y axes in a TensorBoard auc_precision_recall curve so strange?
In figure A, the first point is (x, y) = (1, 0.5009). Why is y 0.5009 even at the 1st step, and why do most of the other values also stay around 0.5 (this is easy to read from figure A)?
In figure B, the first point is (x, y) = (7, 0.4625). Why is this y value (0.4625) not close to 0 even in the first few training steps, as figure C shows?
Upvotes: 0
Views: 1938
Reputation: 1
To answer questions 1 and 2: AUC means Area Under the Curve, so you are looking at the area under the Precision-Recall (PR) curve. The y-axis gives this area as a single number, which lies between 0 and 1 because those are the minimum and maximum areas achievable on a PR curve.
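To make the "one number per evaluation" idea concrete, here is a minimal pure-Python sketch that builds the PR points for a toy set of labels and scores and then sums the area under them. The helper names (pr_points, step_area) and the toy data are made up for illustration, and the step-wise sum used here is just one simple way to estimate the area; it is not claimed to be how TensorFlow computes auc_precision_recall.

```python
def pr_points(labels, scores):
    """Return (recall, precision) pairs, one per distinct score threshold."""
    total_pos = sum(labels)
    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def step_area(points):
    """Step-wise area under the PR points: precision times the gain in recall."""
    area = 0.0
    prev_recall = 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Toy data (hypothetical): 2 positives among 6 examples.
labels = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

points = pr_points(labels, scores)
auc_pr = step_area(points)
print(points)
print(auc_pr)
```

Each evaluation of the model yields one such scalar, and a TensorBoard scalar chart plots that value against the training step, which is why its x axis is the step rather than Recall.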
Upvotes: 0
Reputation: 19
I've got the answer: this is a bug in TensorFlow 1.6.0, caused by the incorrect (trapezoidal) way of calculating the AUC_PR value. The bug has been fixed in the latest version, 1.8.0, by this commit. So if you are training on a heavily skewed dataset, remember to update TensorFlow to the latest version, 1.8.0.
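As a rough illustration of how a trapezoidal estimate can land near 0.5 on a skewed dataset, here is a small pure-Python sketch. It is not TensorFlow's code: the epsilon-smoothed precision and recall, the 200 evenly spaced thresholds, and the helper names are all assumptions made for this example. The toy model outputs the same score for every example, which is roughly what an untrained classifier might do.

```python
EPS = 1e-7  # smoothing constant (an assumption for this sketch)


def smoothed_pr(labels, scores, threshold):
    """Recall and precision at one threshold, smoothed to avoid 0/0."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s > threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s > threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s <= threshold)
    precision = (tp + EPS) / (tp + fp + EPS)
    recall = (tp + EPS) / (tp + fn + EPS)
    return recall, precision


def trapezoidal_auc_pr(labels, scores, num_thresholds=200):
    """Trapezoidal area under PR points sampled at evenly spaced thresholds."""
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    thresholds[0] = -EPS       # everything is predicted positive
    thresholds[-1] = 1 + EPS   # nothing is predicted positive
    pts = [smoothed_pr(labels, scores, t) for t in thresholds]
    area = 0.0
    # Recall falls as the threshold rises, so r0 - r1 is non-negative.
    for (r0, p0), (r1, p1) in zip(pts, pts[1:]):
        area += (r0 - r1) * (p0 + p1) / 2
    return area


# Toy skewed dataset (hypothetical): 2 positives in 1000 examples,
# and a model that gives every example the same score.
labels = [1] * 2 + [0] * 998
scores = [0.3] * 1000
prevalence = sum(labels) / len(labels)

area = trapezoidal_auc_pr(labels, scores)
print(area)
```

Under these assumptions the sampled curve has only two distinct points: recall 1 with precision equal to the positive rate, and recall near 0 with a smoothed precision of 1. The trapezoid joining them has area of about (1 + positive rate) / 2, so the reported value sits just above 0.5 even though the model has learned nothing.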
Upvotes: 0