JWH2006

Reputation: 239

Issue bootstrapping a neural network in R

I have simulated some data to run through a neural net but I cannot get the function for repeating the neural net to work. I am not sure where my code is going wrong.

If I run the neural net one at a time there is no problem, but as soon as I create a function to run 10 repetitions, I get the following errors:

Error in nrow[w] * ncol[w] : non-numeric argument to binary operator

In addition: Warning messages:
1: algorithm did not converge in 1 of 1 repetition(s) within the stepmax
2: In is.na(weights) : is.na() applied to non-(list or vector) of type 'NULL'

library("MASS")
library("neuralnet")
set.seed(123)

Variable1 <- rnorm(n = 500, mean = 26, sd = 5)
Variable2 <- rnorm(n = 500, mean = 600, sd = 100)
Variable3 <- rnorm(n = 500, mean = 115, sd = 15)
group <- 1 
weight <- ((.3*(Variable1/36))+(.3*(Variable2/800))+(.3*(Variable3/145)))
pt1 <- cbind (group, Variable1, Variable2, Variable3, weight)

Variable1 <- rnorm(n = 500, mean = 21, sd = 5)
Variable2 <- rnorm(n = 500, mean = 500, sd = 100)
Variable3 <- rnorm(n = 500, mean = 100, sd = 15)
group <- 0
weight <- ((.3*(Variable1/36))+(.3*(Variable2/800))+(.3*(Variable3/145)))
pt2  <- cbind (group, Variable1, Variable2, Variable3, weight)

pt3 <- as.data.frame(rbind (pt1, pt2))
Outcome <- rbinom (n = 1000, size = 1, prob = pt3$weight)

Data_f <- as.data.frame (cbind(pt3, Outcome))
Data <- subset (Data_f, select = -weight)

data <- Data[, sapply(Data, is.numeric)]
maxValue <- as.numeric(apply (data, 2, max))
minValue <- as.numeric(apply (data, 2, min))

data_scaled <- as.data.frame(scale(data, center = minValue, 
    scale = maxValue-minValue))

k <- 10
MSE_nueral_model <- numeric(k)  # pre-allocate so the loop can assign MSE_nueral_model[i]
library(plyr) 
pbar <- create_progress_bar('text')
pbar$init(k)
for(i in 1:k){
    ind <- sample (1:nrow(data_scaled),600)
    train <- data_scaled[ind,]
    test <- data_scaled[-ind,]
    nueral_model <- neuralnet(formula = 
        Outcome ~ Variable1 + Variable2 + Variable3,
        hidden = c(2,2),
        threshold = 0.01, 
        rep = 1,
        learningrate = NULL,
        algorithm = "rprop+",
        linear.output=FALSE, 
        data= train)
    results <- compute (nueral_model, test[2:4])

    results <- results$net.result*(max(data$Outcome)-
        min(data$Outcome))+ min(data$Outcome)
    Values <- (test$Outcome)*(max(data$Outcome)- 
        min(data$Outcome)) + min(data$Outcome)
    MSE_nueral_model[i] <- sum((results - Values)^2)/nrow(test)

    pbar$step()
}

Upvotes: 1

Views: 477

Answers (1)

EconomySizeAl

Reputation: 230

It's not necessarily that you can generate the neural network correctly on one repetition; rather, you generated it correctly on one run given your random number seed (123, as set in your code). A neural network's initial weights are built from a set of random numbers, and from those random initial weights it refines until it converges. Across 10 runs through this data, at least one of them starts with a set of weights that does not converge within the default stepmax of the neuralnet() function you are using.

There are a handful of ways around this problem. First, you can increase stepmax, making it more likely that your neural network converges:

nueral_model <- neuralnet(formula = 
    Outcome ~ Variable1 + Variable2 + Variable3,
    hidden = c(2,2),
    threshold = 0.01,
    stepmax = 1e+07,
    rep = 1,
    learningrate = NULL,
    algorithm = "rprop+",
    linear.output=FALSE, 
    data= train)

You could also relax the threshold argument to make convergence easier, by changing it to something like:

threshold = 0.025,

What may be the most preferable option, however, is wrapping the neuralnet() call in a try() or tryCatch() statement. That way you can keep your chosen threshold and stepmax, and the loop continues even for runs that do not converge. As a bonus, you can measure your convergence rate, i.e. what percentage of your runs converge given a particular threshold and stepmax.
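A minimal sketch of that approach (assuming the `data_scaled` data frame and `k` from the question's code): the tryCatch() handlers map both errors and the non-convergence warning to NULL, so those runs are simply skipped rather than stopping the loop.

```r
library(neuralnet)

MSE_nueral_model <- rep(NA, k)   # NA marks runs that did not converge
for (i in 1:k) {
    ind <- sample(1:nrow(data_scaled), 600)
    train <- data_scaled[ind, ]
    test  <- data_scaled[-ind, ]

    # A non-converging run raises a warning and leaves NULL weights;
    # treat both warnings and errors as a failed run.
    nueral_model <- tryCatch(
        neuralnet(Outcome ~ Variable1 + Variable2 + Variable3,
                  hidden = c(2, 2), threshold = 0.01, rep = 1,
                  algorithm = "rprop+", linear.output = FALSE,
                  data = train),
        warning = function(w) NULL,
        error   = function(e) NULL)
    if (is.null(nueral_model)) next

    results <- compute(nueral_model, test[2:4])$net.result
    MSE_nueral_model[i] <- sum((results - test$Outcome)^2) / nrow(test)
}

# Convergence rate: the share of the k runs that produced a model
mean(!is.na(MSE_nueral_model))
```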

Upvotes: 1
