I am currently learning the ropes of WEKA modelling with the free UCI breast cancer .arff
file, and with the help of various posts here I was able to tweak its accuracy, which has ranged from 63% to 73%. I use WEKA 3.7.10
on a Windows 7 Starter machine.
I used attribute selection to decrease the number of variables, using InfoGainAttributeEval with Ranker. I chose the top five, with the following result:
Evaluator: weka.attributeSelection.InfoGainAttributeEval
Search: weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N -1
Relation: breast-cancer
Instances: 286
Attributes: 10
age
menopause
tumor-size
inv-nodes
node-caps
deg-malig
breast
breast-quad
irradiat
Class
Evaluation mode: 10-fold cross-validation
=== Attribute selection 10 fold cross-validation (stratified), seed: 1 ===
average merit      average rank    attribute
0.078 +- 0.011     1.3 +- 0.64     6 deg-malig
0.071 +- 0.01      1.9 +- 0.3      4 inv-nodes
0.061 +- 0.008     3   +- 0.77     3 tumor-size
0.051 +- 0.007     3.8 +- 0.4      5 node-caps
0.026 +- 0.006     5   +- 0        9 irradiat
0.012 +- 0.003     6.4 +- 0.49     1 age
0.01  +- 0.003     6.6 +- 0.49     8 breast-quad
0.003 +- 0.001     8.5 +- 0.5      7 breast
0.003 +- 0.002     8.5 +- 0.5      2 menopause
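To make the "average merit" column easier to read, here is a minimal sketch of the textbook information-gain calculation for a nominal attribute, H(Class) - H(Class | Attribute), in bits. It is not WEKA's code: the function names and the toy data are hypothetical, and WEKA's handling of missing values is not reproduced.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def info_gain(values, labels):
    """Information gain of a nominal attribute with respect to the class."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class within each attribute value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical toy data, NOT taken from the breast-cancer file.
labels = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]
perfect = ["a", "a", "a", "a", "b", "b", "b", "b"]  # separates the classes
useless = ["a", "b", "a", "b", "a", "b", "a", "b"]  # independent of the class

print(info_gain(perfect, labels))  # 1.0
print(info_gain(useless, labels))  # 0.0

# Upper bound on the gain for a 201 / 85 class split (counts from the
# confusion matrix further down in this post).
class_entropy = entropy(["no"] * 201 + ["yes"] * 85)
print(round(class_entropy, 3))
```

Under this definition the gain cannot exceed the class entropy, which is roughly 0.88 bits for a 201/85 split, so merits below 0.08 suggest that each attribute on its own says little about the class.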
After removing the low-ranked variables, I proceeded to create my model. I chose Multilayer Perceptron because it is an algorithm required by the journal I am basing my study on.
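As a cross-check of the removal step, the sketch below derives which 1-based attribute indices have to be dropped when only the five top-ranked attributes and the class are kept. The helper `to_range_string` is a hypothetical name introduced for illustration; the attribute order and the ranking come from the output above.

```python
# Attribute order as listed in the attribute-selection output (1-based in WEKA).
attributes = ["age", "menopause", "tumor-size", "inv-nodes", "node-caps",
              "deg-malig", "breast", "breast-quad", "irradiat", "Class"]

# The five top-ranked attributes plus the class attribute.
keep = {"deg-malig", "inv-nodes", "tumor-size", "node-caps", "irradiat", "Class"}

def to_range_string(indices):
    """Compress 1-based indices into a range string such as '1-2,7-8'."""
    parts = []
    start = prev = None
    for i in sorted(indices):
        if start is None:
            start = prev = i
        elif i == prev + 1:
            prev = i  # extend the current run
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = i
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)

drop = [i for i, name in enumerate(attributes, start=1) if name not in keep]
remove_spec = to_range_string(drop)
print(remove_spec)  # 1-2,7-8
```

The result agrees with the `Remove-R1-2,7-8` suffix that appears in the relation name of the classifier output in this post.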
I followed Bernhard Pfahringer's suggestion to use 0.1 for both the learning rate and the momentum, and powers of two (1, 2, 4, 8, and so on) for the number of hidden nodes and the number of epochs.
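The settings described above can be enumerated as a small grid. The sketch below only builds scheme strings; it does not call WEKA. The option layout (`-L`, `-M`, `-N`, `-V 0 -S 0 -E 20`, `-H`) is copied from the scheme line shown later in this post, while the exponent ranges and the helper names are assumptions made for illustration.

```python
from itertools import product

BASE = "weka.classifiers.functions.MultilayerPerceptron"

def powers_of_two(lo_exp, hi_exp):
    """Return [2**lo_exp, ..., 2**hi_exp]."""
    return [2 ** e for e in range(lo_exp, hi_exp + 1)]

def mlp_schemes(hidden_nodes, epochs, learning_rate=0.1, momentum=0.1):
    """Build one scheme string per (hidden nodes, epochs) pair."""
    return [
        f"{BASE} -L {learning_rate} -M {momentum} -N {n} -V 0 -S 0 -E 20 -H {h}"
        for h, n in product(hidden_nodes, epochs)
    ]

# Assumed ranges: 1..8 hidden nodes, 512..16384 epochs.
schemes = mlp_schemes(powers_of_two(0, 3), powers_of_two(9, 14))
print(len(schemes))  # 24
print(schemes[0])
```

Each string could then be pasted into the Explorer or passed to a script. Comparing many configurations on the same cross-validation folds tends to give an optimistic picture of the best one, so the winning setting is worth confirming on data that played no part in the search.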
After a few tries with this method, I noticed a pattern: using 2 for the hidden nodes and a power of two for the epochs (512, 1024, 2048, ...) results in increasing accuracy. For example, 2 hidden nodes with 1024 epochs, and so on.
I have a varied series of results, but the highest one I have got so far was with the following (using 2 hidden nodes and 16384 epochs):
Scheme: weka.classifiers.functions.MultilayerPerceptron -L 0.1 -M 0.1 -N 16384 -V 0 -S 0 -E 20 -H 2
Relation: breast-cancer-weka.filters.unsupervised.attribute.Remove-R1-2,7-8
Instances: 286
Attributes: 6
tumor-size
inv-nodes
node-caps
deg-malig
irradiat
Class
Test mode: 10-fold cross-validation
=== Classifier model (full training set) ===
Sigmoid Node 0
Inputs Weights
Threshold -2.4467109489840375
Node 2 2.960926490700117
Node 3 1.5276384018358489
Sigmoid Node 1
Inputs Weights
Threshold 2.446710948984037
Node 2 -2.9609264907001167
Node 3 -1.5276384018358493
Sigmoid Node 2
Inputs Weights
Threshold 0.8594931368555995
Attrib tumor-size=0-4 -0.6809394102558067
Attrib tumor-size=5-9 -0.7999278705976403
Attrib tumor-size=10-14 -0.5139914771540879
Attrib tumor-size=15-19 2.3071396030112834
Attrib tumor-size=20-24 -6.316868254289899
Attrib tumor-size=25-29 5.535754474315768
Attrib tumor-size=30-34 -12.31495416708197
Attrib tumor-size=35-39 2.165860489861981
Attrib tumor-size=40-44 10.740913335424047
Attrib tumor-size=45-49 9.102261927484186
Attrib tumor-size=50-54 -17.072392893550735
Attrib tumor-size=55-59 0.043056333044031
Attrib inv-nodes=0-2 9.578867366884618
Attrib inv-nodes=3-5 1.3248317047328586
Attrib inv-nodes=6-8 -5.081199984305494
Attrib inv-nodes=9-11 -8.604844224457239
Attrib inv-nodes=12-14 2.2330604430275907
Attrib inv-nodes=15-17 -2.8692154868988355
Attrib inv-nodes=18-20 0.04225234708199947
Attrib inv-nodes=21-23 0.017664071511846485
Attrib inv-nodes=24-26 -0.9992481277256989
Attrib inv-nodes=27-29 -0.02737484354173595
Attrib inv-nodes=30-32 -0.04607516719307534
Attrib inv-nodes=33-35 -0.038969156415242706
Attrib inv-nodes=36-39 0.03338452826774849
Attrib node-caps 6.764954936579671
Attrib deg-malig=1 -5.037151186065571
Attrib deg-malig=2 12.469858109768378
Attrib deg-malig=3 -8.382625277311769
Attrib irradiat 8.302010702287868
Sigmoid Node 3
Inputs Weights
Threshold -0.7428771456532647
Attrib tumor-size=0-4 3.5709673152321555
Attrib tumor-size=5-9 3.563713261511895
Attrib tumor-size=10-14 7.86118954430952
Attrib tumor-size=15-19 2.8762105204084167
Attrib tumor-size=20-24 4.60168522637948
Attrib tumor-size=25-29 -5.849391383398816
Attrib tumor-size=30-34 -1.6805815971562046
Attrib tumor-size=35-39 -12.022394228003419
Attrib tumor-size=40-44 11.922229608392747
Attrib tumor-size=45-49 -1.9939414047194557
Attrib tumor-size=50-54 -5.9801974214306215
Attrib tumor-size=55-59 -0.04909236196295539
Attrib inv-nodes=0-2 5.569516359775502
Attrib inv-nodes=3-5 -7.871275549119543
Attrib inv-nodes=6-8 3.405277467966008
Attrib inv-nodes=9-11 -0.3253699778307026
Attrib inv-nodes=12-14 1.244234346055825
Attrib inv-nodes=15-17 1.179311225120216
Attrib inv-nodes=18-20 0.03495291263409073
Attrib inv-nodes=21-23 0.0043299366591334695
Attrib inv-nodes=24-26 0.6595250300030937
Attrib inv-nodes=27-29 -0.02503529326219822
Attrib inv-nodes=30-32 0.041787638417097844
Attrib inv-nodes=33-35 0.008416652090130837
Attrib inv-nodes=36-39 -0.014551878794926747
Attrib node-caps 4.7997880904143955
Attrib deg-malig=1 1.6752746955482163
Attrib deg-malig=2 6.130488722916935
Attrib deg-malig=3 -6.989852429736567
Attrib irradiat 8.716254786514295
Class no-recurrence-events
Input
Node 0
Class recurrence-events
Input
Node 1
Time taken to build model: 27.05 seconds
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances 210 73.4266 %
Incorrectly Classified Instances 76 26.5734 %
Kappa statistic 0.2864
Mean absolute error 0.3312
Root mean squared error 0.4494
Relative absolute error 79.1456 %
Root relative squared error 98.3197 %
Coverage of cases (0.95 level) 98.951 %
Mean rel. region size (0.95 level) 97.7273 %
Total Number of Instances 286
=== Detailed Accuracy By Class ===
               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC    ROC Area  PRC Area  Class
               0.891    0.635    0.768      0.891   0.825      0.300  0.633     0.748     no-recurrence-events
               0.365    0.109    0.585      0.365   0.449      0.300  0.633     0.510     recurrence-events
Weighted Avg.  0.734    0.479    0.714      0.734   0.713      0.300  0.633     0.677
=== Confusion Matrix ===
   a   b   <-- classified as
 179  22 |   a = no-recurrence-events
  54  31 |   b = recurrence-events
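For context on the 73.4% figure, the sketch below recomputes the summary statistics from the confusion matrix above and compares the accuracy with the majority-class baseline, that is, always predicting no-recurrence-events. The formulas are the standard definitions of accuracy, Cohen's kappa, precision and recall; the variable names are illustrative.

```python
# Confusion matrix from the WEKA output: rows = actual, columns = predicted.
matrix = [[179, 22],   # actual no-recurrence-events
          [54, 31]]    # actual recurrence-events

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(matrix)))
accuracy = correct / total

row_totals = [sum(row) for row in matrix]        # actual class counts
col_totals = [sum(col) for col in zip(*matrix)]  # predicted class counts

# Accuracy of always predicting the most frequent actual class.
baseline = max(row_totals) / total

# Cohen's kappa: agreement beyond what the marginals alone would produce.
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
kappa = (accuracy - expected) / (1 - expected)

precision = [matrix[i][i] / col_totals[i] for i in range(len(matrix))]
recall = [matrix[i][i] / row_totals[i] for i in range(len(matrix))]

print(f"accuracy  {accuracy:.4f}")  # 0.7343
print(f"baseline  {baseline:.4f}")  # 0.7028
print(f"kappa     {kappa:.4f}")     # 0.2864
print("precision", [round(p, 3) for p in precision])
print("recall   ", [round(r, 3) for r in recall])
```

The recomputed values match the WEKA summary. They also show that the model is only about three percentage points above the baseline and finds 31 of the 85 recurrence cases, which is worth keeping in mind when comparing parameter settings by accuracy alone.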
My question is: how can I raise the accuracy on this data to at least the 90% mark? Do I have to apply filtering, or use another pattern of MLP input parameters?
Once I have learned how to do this, I plan to apply it to another data set, which has around 50 variables and 100,000 instances.
Upvotes: 3
Views: 9524
Reputation: 66805
There is obviously no good answer for such a question, but I will give you some more or less general hints for using MLP:
Upvotes: 6