Neural network using R "nnet" package - NAs when using size > 2

I have a problem building a model with the nnet package. If I understand correctly, the size parameter is the number of neurons in the hidden layer. With size = 1 or 2 the model trains but gives poor results; with size = 3 or more I get this error:

   Error in (n1 + 1L):n2 : NA/NaN argument
   Warning message:
   In nominalTrainWorkflow(dat = trainData, info = trainInfo, method = method,  :
     There were missing values in resampled performance measures.

My call:

net <- train(Y ~ ., data = af,
             method = 'nnet',
             preProcess = NULL,
             trControl = fitControl,
             verbose = FALSE,
             tuneGrid = data.frame(.size = 2, .decay = 5e-3),
             MaxNWts = 8000,
             linout = TRUE,
             maxit = 1000
)

So what am I doing wrong, and how can I increase the size parameter without getting this error? My data has 338 observations and 2,830 variables.

Thank you in advance

Upvotes: 2

Views: 3698

Answers (1)

Hack-R

Reputation: 23200

download.file(url = "https://cloud.mail.ru/public/6e188e2baa1f%2Fdata.RData",
              destfile = "data.RData")
load("data.RData")  # load from the file we just downloaded
require(nnet)
require(caret)

set.seed(123)

# caret needs one vector of tuning seeds per resample (here 50: 10-fold CV
# repeated 5 times) plus, as the final element, a single seed for the last model
seeds <- vector(mode = "list", length = 51)
for(i in 1:50) seeds[[i]] <- sample.int(1000, 22)
seeds[[51]] <- sample.int(1000, 1)

fitControl <- trainControl(method = "adaptive_cv",
                           repeats = 5,
                           verboseIter = TRUE,
                           seeds = seeds)

net <- train(Y ~ ., data = af,
             method = 'nnet',
             preProcess = NULL,
             trControl = fitControl,
             verbose = FALSE,
             tuneGrid = data.frame(.size = 3, .decay = 5e-3),
             MaxNWts = 4000,
             linout = TRUE,
             maxit = 1000,
             na.action = "na.omit"
)

At the root of your problem are degrees of freedom.

Your data object, af, has these dimensions:

> dim(af)
[1]  338 2830

and you're regressing all 2,829 columns (i.e. 2,830 - 1) on the 338 observations of Y, which is not valid due to insufficient degrees of freedom.

Your model fails with two messages:

  1. model fit failed...Error in nnet.default(x, y, w, ...) : too many (8494) weights
  2. There were missing values in resampled performance measures.

Like you, I focused on #2 at first. I tried adding the na.action = "na.omit" argument to caret's train() function, which is a good idea but does not solve this problem. Then I realized that the first message was more to the point. It is telling you the same thing -- that the number of weights, which is driven by the number of explanatory variables, is simply far too high.
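
You can verify where that 8,494 comes from by counting the weights yourself. For a single-hidden-layer nnet with one (linear) output unit, the weight count is size * (p + 1) + (size + 1), where p is the number of inputs and the + 1 terms are bias units. A quick check, assuming af as loaded above:

p    <- ncol(af) - 1           # 2829 predictors (every column except Y)
size <- 3
size * (p + 1) + (size + 1)    # 8494 -- matches "too many (8494) weights"

That works out to roughly 25 weights per observation (8494 / 338), so even raising MaxNWts high enough to allow the fit would only let you estimate a hopelessly overparameterized model.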

So I took what you have and reduced the number of regressors (columns used), and now it works fine:

X <- af[,1:10]

net <- train(y = af$Y, x = X,
             method = 'nnet',
             preProcess = NULL,
             trControl = fitControl,
             verbose = FALSE,
             tuneGrid = data.frame(.size = 3, .decay = 5e-3),
             MaxNWts = 4000,
             linout = TRUE,
             maxit = 1000,
             na.action = "na.omit"
)
  # weights:  37
  initial  value 11680.837531 
  iter  10 value 567.695784
  iter  20 value 536.975033
  iter  30 value 531.229395
  iter  40 value 529.138440
  iter  50 value 528.186846
  iter  60 value 527.256175
  iter  70 value 526.785917
  iter  80 value 524.733350
  iter  90 value 523.327685
  iter 100 value 523.153812
  iter 110 value 522.866651
  iter 120 value 522.520324
  iter 130 value 519.557709
  iter 140 value 519.034003
  iter 150 value 518.865745
  iter 160 value 518.817728
  iter 170 value 518.782720
  iter 180 value 518.733853
  iter 190 value 518.676881
  iter 200 value 518.664181
  iter 210 value 518.657738
  iter 220 value 518.656348
  iter 230 value 518.641653
  iter 240 value 518.634891
  iter 250 value 518.633536
  iter 260 value 518.632945
  iter 270 value 518.632559
  iter 280 value 518.632273
  final  value 518.632207 
  converged
  ...
  (similar convergence output from the remaining five fits omitted; all converged)

Note that I chose the columns to use in x arbitrarily. You can (and should) choose the columns based on their performance, unless you have some theoretical framework to guide your choice. For instance, you could select the columns with stepwise or lasso regression (see the sketch below), or by manually running various training models and comparing their performance with an ROC curve (available in the ROCR package).
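
As one minimal illustration of the lasso route (not part of the original answer; it assumes the predictors in af are all numeric and uses the glmnet package):

require(glmnet)

# Cross-validated lasso (alpha = 1) over all candidate predictors
x_all <- as.matrix(af[, setdiff(names(af), "Y")])
cv    <- cv.glmnet(x_all, af$Y, alpha = 1)

# Keep only predictors with non-zero coefficients at the selected lambda
coefs <- coef(cv, s = "lambda.min")
keep  <- setdiff(rownames(coefs)[which(coefs != 0)], "(Intercept)")

X <- af[, keep, drop = FALSE]   # pass this X to train() instead of af[,1:10]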

Upvotes: 3
