Maryam Nasseri

Reputation: 67

Training a neural network model with the neuralnet package in R is taking days and has not finished yet

I have constructed a model in R (RStudio) for a neural network with two hidden layers using the neuralnet package. It has been running non-stop for 3 days, but the model is still not trained. All I see is the usual blinking cursor, which indicates that data is being processed. There are no error or warning messages and nothing has crashed. All I need to know is whether this is normal and how long I should expect it to take before the model is trained and the object is created.

This is a relatively simple model with 1260 rows and 11 columns of continuous data as features predicting the 12th column ('labels'), a factor with six classes (a multi-class classification task). The data was properly scaled before splitting it into training and test datasets, and the factor column of classes was re-coded to numeric values with revalue() in R. Here are the model specs:

model = neuralnet(
    labels ~ A + B + C + D + E + F + G + H + I + J + K,
    data = train_data,
    hidden = c(4, 2),
    linear.output = FALSE,
    stepmax = 1e+06
)

I have read many online posts, including a number of Stack Overflow questions of a similar nature, but none gives me the answer I need. Is this caused by the GPU or by the complexity of the model, and how can I solve this issue? Here are my machine's specs:

Ubuntu 20.04.1 LTS 64-bit, 48 GiB RAM, AMD® Ryzen 5 2600 six-core processor × 12 threads.

I can provide more information about the code and my machine if needed. I would appreciate it if anyone could help out.

Edit: Someone in the comments suggested that I add more details about my code, so here are the pre-processing parts:

# packages used below: plyr for revalue(), dplyr for the pipe and mutate()
library(plyr)
library(dplyr)

# revalue the labels:
data <- data %>%
  mutate(labels = revalue(labels, c("label1" = 1, "label2" = 2,
                                    "label3" = 3, "label4" = 4,
                                    "label5" = 5, "label6" = 6)))

# scaling numeric columns
varnames <- c("labels")
index <- names(data) %in% varnames
data1 <- scale(data[, !index])
data <- merge(data["labels"], data1)

# split training and testing datasets
train_idx <- sample(nrow(data), 0.7 * nrow(data))
train_data <- data[train_idx, ]
test_data <- data[-train_idx, ]
data <- data %>% mutate_if(is.character, as.factor)

# here the above model is run

Upvotes: 0

Views: 746

Answers (1)

pabloabur

Reputation: 34

It's a very small network and a small dataset; it shouldn't take that long. On top of that, your specs are good enough. My suggestion is that you first run an example from the documentation to see whether you can get it running properly. Here is one of the examples from the documentation:

library(neuralnet)
# Binary classification
nn <- neuralnet(Species == "setosa" ~ Petal.Length + Petal.Width, iris, linear.output = FALSE)
## Not run: print(nn)
## Not run: plot(nn)

Then go back to your code. One hour is already too much, so if that happens again, just try to subsample your dataset (see the sketch below). If it keeps happening, maybe there is some configuration you need to do on your Linux machine.
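As a minimal sketch of what I mean by subsampling, assuming the train_data object and column names from your question (the 20% fraction and the seed are arbitrary choices of mine):

set.seed(42)

# take a random 20% subsample of the training data
sub_idx <- sample(nrow(train_data), 0.2 * nrow(train_data))
sub_data <- train_data[sub_idx, ]

# fit the same model on the smaller sample to check that it converges quickly
model_sub <- neuralnet(
  labels ~ A + B + C + D + E + F + G + H + I + J + K,
  data = sub_data,
  hidden = c(4, 2),
  linear.output = FALSE,
  stepmax = 1e+06
)

If this small run finishes in seconds to minutes, the problem is more likely in the data you feed the full model than in your machine.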

Edit after question update: the merge command you used merges two data frames by common columns or row names, and since there are no common columns here, my guess is that you are adding a lot of rows (a Cartesian product). If I do the same with the iris dataset, the number of rows goes from 150 to 22,500. You can instead convert to a data frame after scaling and assign the label column with simple bracket notation instead of merge:

data1 <- as.data.frame(scale(data[, !index]))
data1['labels'] <- data['labels']
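For illustration, a quick check with iris that reproduces the row counts mentioned above and shows the difference between merge and the bracket assignment:

# merge() with no common columns produces a Cartesian product: 150 * 150 rows
iris_scaled <- scale(iris[, 1:4])
merged <- merge(iris["Species"], iris_scaled)
nrow(merged)   # 22500

# converting to a data frame and assigning the label column keeps 150 rows
fixed <- as.data.frame(iris_scaled)
fixed["Species"] <- iris["Species"]
nrow(fixed)    # 150

With the blown-up training set, each training step has to pass over vastly more rows, which would explain the multi-day runtime.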

Upvotes: 1
