Reputation: 21
I have a problem building a model with the nnet package. If I understand correctly, the size parameter is the number of neurons in the hidden layer. I used size = 1 or 2, but that gives me bad results. When I try 3 or more, I get this error:
Error in (n1 + 1L):n2 : NA/NaN argument
Warning message:
In nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method, :
  There were missing values in resampled performance measures.
net <- train(Y ~ ., data = af,
             method = "nnet",
             preProcess = NULL,
             trControl = fitControl,
             verbose = FALSE,
             tuneGrid = data.frame(.size = 2, .decay = 5e-3),
             MaxNWts = 8000,
             linout = TRUE,
             maxit = 1000
             )
So what am I doing wrong, and how can I increase the size parameter without getting this error? My data has 338 observations and 2,830 variables.
Thank you in advance.
Upvotes: 2
Views: 3698
Reputation: 23200
# Download the questioner's data into the working directory and load it
download.file(url = "https://cloud.mail.ru/public/6e188e2baa1f%2Fdata.RData",
              destfile = "data.RData")
load("data.RData")
require(nnet)
require(caret)

set.seed(123)
# One vector of 22 candidate seeds per resample (50 resamples),
# plus a single seed for the final model in element 51,
# as caret's trainControl() expects
seeds <- vector(mode = "list", length = 51)
for(i in 1:50) seeds[[i]] <- sample.int(1000, 22)
seeds[[51]] <- sample.int(1000, 1)

fitControl <- trainControl(method = "adaptive_cv",
                           repeats = 5,
                           verboseIter = TRUE,
                           seeds = seeds)
net <- train(Y ~ ., data = af,
             method = "nnet",
             preProcess = NULL,
             trControl = fitControl,
             verbose = FALSE,
             tuneGrid = data.frame(.size = 3, .decay = 5e-3),
             MaxNWts = 4000,
             linout = TRUE,
             maxit = 1000,
             na.action = "na.omit"
             )
At the root of your problem is a lack of degrees of freedom. Your data object, af, has these dimensions:
> dim(af)
[1] 338 2830
and you're regressing Y on all of the remaining 2,829 columns (2,830 - 1) with only 338 observations, which leaves you with far too few degrees of freedom.
Your model fails with two messages:

1. model fit failed...Error in nnet.default(x, y, w, ...) : too many (8494) weights
2. There were missing values in resampled performance measures.
Like you, at first I focused on #2. I tried adding the na.action = "na.omit" parameter to caret's train() function, which is a good idea in general but will not solve this problem.
Then I realized that the first message was more to the point. It's telling you the same thing: the number of weights, which grows with the number of explanatory variables, is just way too high.
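To see where 8,494 comes from: with one hidden layer, nnet fits (p + 1) * size input-to-hidden weights plus (size + 1) * k hidden-to-output weights for p predictors and k outputs. A quick sketch of the arithmetic (nnet_weights() is my own hypothetical helper, not part of nnet):

# Weight count for nnet's single hidden layer with k (here 1) linear outputs
nnet_weights <- function(p, size, k = 1) (p + 1) * size + (size + 1) * k
nnet_weights(p = 2829, size = 2)  # 5663 -- under the question's MaxNWts = 8000, so size = 2 ran
nnet_weights(p = 2829, size = 3)  # 8494 -- over the cap, hence "too many (8494) weights"

Raising MaxNWts would let the call run, but with only 338 observations the model would still be hopelessly over-parameterized, which is the degrees-of-freedom problem again.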
So I took what you have, reduced the number of regressors (columns used), and now it works fine:

X <- af[, 1:10]
> net <- train(y=af$Y, x=X,
+ method = 'nnet',
+ preProcess = NULL,
+ trControl = fitControl,
+ verbose = FALSE,
+ tuneGrid=data.frame(.size=3, .decay=5e-3),
+ MaxNWts=4000,
+ linout =T,
+ maxit=1000,
+ na.action = "na.omit"
+ )
# weights: 37
initial value 11680.837531
iter 10 value 567.695784
iter 20 value 536.975033
iter 30 value 531.229395
iter 40 value 529.138440
iter 50 value 528.186846
iter 60 value 527.256175
iter 70 value 526.785917
iter 80 value 524.733350
iter 90 value 523.327685
iter 100 value 523.153812
iter 110 value 522.866651
iter 120 value 522.520324
iter 130 value 519.557709
iter 140 value 519.034003
iter 150 value 518.865745
iter 160 value 518.817728
iter 170 value 518.782720
iter 180 value 518.733853
iter 190 value 518.676881
iter 200 value 518.664181
iter 210 value 518.657738
iter 220 value 518.656348
iter 230 value 518.641653
iter 240 value 518.634891
iter 250 value 518.633536
iter 260 value 518.632945
iter 270 value 518.632559
iter 280 value 518.632273
final value 518.632207
converged
# weights: 37
initial value 17038.362932
iter 10 value 501.607084
iter 20 value 492.283099
iter 30 value 491.783153
iter 40 value 490.882435
iter 50 value 490.509487
iter 60 value 490.347638
iter 70 value 490.276362
iter 80 value 490.135871
iter 90 value 490.046862
iter 100 value 490.042117
iter 110 value 490.040552
iter 120 value 490.038263
iter 130 value 490.033484
final value 490.031813
converged
# weights: 37
initial value 14383.923273
iter 10 value 689.694268
iter 20 value 632.520128
iter 30 value 506.806825
iter 40 value 497.204789
iter 50 value 496.058566
iter 60 value 495.153170
iter 70 value 493.646931
iter 80 value 492.361818
iter 90 value 492.088474
iter 100 value 492.008913
iter 110 value 491.498147
iter 120 value 491.345332
iter 130 value 491.306452
iter 140 value 491.290815
iter 150 value 491.278200
iter 160 value 491.244056
iter 170 value 491.232009
iter 180 value 491.226179
iter 190 value 491.212958
iter 200 value 491.209201
iter 210 value 491.208694
final value 491.208581
converged
# weights: 37
initial value 11084.289612
iter 10 value 622.044592
iter 20 value 520.053760
iter 30 value 507.456841
iter 40 value 502.000020
iter 50 value 497.723332
iter 60 value 495.164695
iter 70 value 494.157722
iter 80 value 492.678614
iter 90 value 491.936464
iter 100 value 491.706778
iter 110 value 491.360232
iter 120 value 491.141080
iter 130 value 490.920068
iter 140 value 490.714486
iter 150 value 490.578641
iter 160 value 490.542580
iter 170 value 490.524909
iter 180 value 490.486804
iter 190 value 490.338614
iter 200 value 490.110047
iter 210 value 489.915521
iter 220 value 489.812929
iter 230 value 489.756234
iter 240 value 489.718024
iter 250 value 489.711636
iter 260 value 489.702732
iter 270 value 489.661668
iter 280 value 489.644434
iter 290 value 489.633946
iter 300 value 489.625405
iter 310 value 489.593120
iter 320 value 489.572627
iter 330 value 489.569740
iter 340 value 489.528552
iter 350 value 489.511607
iter 360 value 489.508993
final value 489.508289
converged
# weights: 37
initial value 16479.112645
iter 10 value 615.081555
iter 20 value 501.104433
iter 30 value 481.204462
iter 40 value 479.305649
iter 50 value 477.681071
iter 60 value 476.668767
iter 70 value 476.010448
iter 80 value 475.480687
iter 90 value 474.542118
iter 100 value 473.701240
iter 110 value 473.345255
iter 120 value 473.099798
iter 130 value 472.974764
iter 140 value 472.849499
iter 150 value 472.685356
iter 160 value 472.521386
iter 170 value 472.438417
iter 180 value 472.294666
iter 190 value 472.250269
iter 200 value 472.231414
iter 210 value 472.226611
iter 220 value 472.222344
iter 230 value 472.213415
iter 240 value 472.208953
iter 250 value 472.204120
iter 260 value 472.199380
iter 270 value 472.196929
iter 280 value 472.195935
iter 290 value 472.195052
iter 300 value 472.194005
iter 310 value 472.193293
final value 472.193152
converged
# weights: 37
initial value 16669.126242
iter 10 value 888.977047
iter 20 value 637.124939
iter 30 value 624.802902
iter 40 value 622.441515
iter 50 value 618.396731
iter 60 value 618.219093
iter 70 value 617.815129
iter 80 value 617.295605
iter 90 value 616.955292
iter 100 value 616.817402
iter 110 value 616.766023
iter 120 value 616.749874
iter 130 value 616.745140
iter 140 value 616.724481
iter 150 value 616.706785
iter 160 value 616.696147
iter 170 value 616.516945
iter 180 value 616.288825
iter 190 value 615.986669
iter 200 value 615.853741
iter 210 value 615.759414
iter 220 value 615.704034
iter 230 value 615.669519
iter 240 value 615.642135
iter 250 value 615.626347
iter 260 value 615.548658
iter 270 value 615.526684
iter 280 value 615.509073
iter 290 value 615.494325
iter 300 value 615.481876
iter 310 value 615.462706
iter 320 value 615.459488
iter 330 value 615.450263
iter 340 value 615.438322
iter 350 value 615.430339
iter 360 value 615.423634
iter 370 value 615.418235
iter 380 value 615.391565
iter 390 value 615.383094
iter 400 value 615.379039
iter 410 value 615.367349
iter 420 value 615.361426
iter 430 value 615.358651
final value 615.358194
converged
Note that I chose the columns used in x arbitrarily. You can (and should) choose the columns based on their performance, unless you have a theoretical framework to guide your choices. For instance, you could select the columns with stepwise or lasso regression methods, or by manually running various training models and comparing their performance with an ROC curve (available in the ROCR package). A lasso-based screen is sketched below.
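As one illustration, here is a minimal lasso screen with glmnet, assuming af is loaded, Y is the numeric response, and the remaining columns are numeric; the package choice and variable names are mine, not from the original code:

require(glmnet)
# Cross-validated lasso: predictors with non-zero coefficients survive the screen
x_mat <- as.matrix(af[, names(af) != "Y"])
cvfit <- cv.glmnet(x = x_mat, y = af$Y)
coefs <- coef(cvfit, s = "lambda.min")
keep  <- setdiff(rownames(coefs)[coefs[, 1] != 0], "(Intercept)")
X     <- af[, keep, drop = FALSE]  # feed this to train() as above

With a few dozen surviving columns instead of 2,829, the weight count stays small and the nnet fit has the degrees of freedom it needs.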
Upvotes: 3