Adit Sanghvi

Reputation: 154

Neuralnet formula in R

I am a beginner in R.

I am trying to learn how to build neural networks in R and use them to predict an output. I found an example online that uses the Boston dataset and adapted it to test my own code. It works (I am getting an MSE of 250 :( ), but I cannot understand this part of the code.

   n <- names(train_)
   f <- as.formula(paste("pred_con ~", paste(n[!n %in% "pred_con"], collapse = " + ")))
   nn <- neuralnet(f,data=train_,hidden=c(5,3),linear.output=T)
   pr.nn <- compute(nn,test_[,1:5])

Can somebody explain how this works? Thanks!

Upvotes: 3

Views: 5295

Answers (1)

ChristyCasey

Reputation: 446

I think you mean this line of code:

   f <- as.formula(paste("pred_con ~", paste(n[!n %in% "pred_con"], collapse = " + ")))

So let's break this down piece by piece.

f is the name of the variable that will hold the formula.

as.formula converts its argument to an object of class "formula", which has the general form Response ~ Variable_1 + Variable_2. That reads as: use Variable_1 and Variable_2 to predict the Response value.
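To make the formula type concrete, here is a minimal sketch in base R. The names Response, Variable_1 and Variable_2 are just the placeholders from the sentence above, not columns from your data.

```r
# A formula can be typed directly or built from a string with as.formula().
f1 <- Response ~ Variable_1 + Variable_2
f2 <- as.formula("Response ~ Variable_1 + Variable_2")

class(f1)      # "formula"
class(f2)      # "formula"

# all.vars() lists the variable names a formula refers to,
# response first, then the predictors.
all.vars(f2)   # "Response" "Variable_1" "Variable_2"
```

Building the formula from a string is what lets the original code assemble it from column names instead of typing every name by hand.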

paste is a function that concatenates string pieces. So

paste("Str","ing",sep="") 

would give "String". The sep="" argument sets the separator placed between the inputs to the empty string, i.e. nothing. (The default separator is a single space.)

In your code, the second (inner) paste call uses collapse = " + ", which joins all the elements of a vector into one string with a plus sign between them.
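A quick way to see what sep does is to run paste with and without it. This is only a small illustration in base R; the variable names are made up for the example.

```r
# sep controls what goes between the arguments of paste().
no_gap   <- paste("Str", "ing", sep = "")  # "String"
with_gap <- paste("Str", "ing")            # "Str ing" (default sep is " ")

# paste0() is shorthand for paste(..., sep = "").
shorthand <- paste0("Str", "ing")          # "String"

print(c(no_gap, with_gap, shorthand))
```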

paste(n[!n %in% "pred_con"], collapse = " + ")

n holds the names of the columns in the train_ set:

n <- names(train_)

So paste(n, collapse = " + ") would give us every column name with a + sign between them.

However, we don't want "pred_con", the value we are trying to predict, on the right-hand side. That is handled by the subsetting inside the paste call.

So n[!n %in% "pred_con"] means: every name that is NOT "pred_con".
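The subsetting step is easier to follow when each piece is evaluated on its own. In this sketch the column names x1, x2 and x3 are hypothetical stand-ins for whatever is in your train_ set.

```r
# Hypothetical column names; only "pred_con" comes from the question.
n <- c("x1", "x2", "pred_con", "x3")

is_target <- n %in% "pred_con"    # FALSE FALSE  TRUE FALSE
keep      <- !n %in% "pred_con"   # TRUE  TRUE  FALSE  TRUE
# Note: %in% binds tighter than !, so this is !(n %in% "pred_con").

predictors <- n[keep]             # "x1" "x2" "x3"

# setdiff() is a shorter way to drop one name from a vector of unique names.
predictors_alt <- setdiff(n, "pred_con")

print(predictors)
```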

So from

paste(n[!n %in% "pred_con"], collapse = " + ")

we get every column name, OTHER than "pred_con", with a + sign between them.

We want the formula in the form Y ~ X1 + X2.

So we paste "pred_con ~" in front of the list of column names we just made, using another paste call. That gives us:
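The difference between sep and collapse is what makes this work, so here is a small sketch of it. The predictor names are hypothetical.

```r
# Hypothetical predictor names (everything except "pred_con").
predictors <- c("x1", "x2", "x3")

# collapse joins the ELEMENTS of one vector into a single string.
rhs <- paste(predictors, collapse = " + ")   # "x1 + x2 + x3"

# sep only separates different ARGUMENTS, so with a single vector
# argument it changes nothing and you still get three strings back.
not_joined <- paste(predictors, sep = " + ") # "x1" "x2" "x3"

print(rhs)
print(length(not_joined))
```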

paste("pred_con ~", paste(n[!n %in% "pred_con"], collapse = " + "))

And finally, since that result is still just a string, we wrap it in as.formula to turn it into a formula.

Which now brings us to the full line:

 f <- as.formula(paste("pred_con ~", paste(n[!n %in% "pred_con"], collapse = " + ")))
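You can check the whole line on a tiny data frame. The data below is made up purely for illustration (only the names train_ and pred_con come from the question), and the reformulate() variant is an alternative from base R's stats package, not something the original example uses.

```r
# Hypothetical training data with two predictors and the target column.
train_ <- data.frame(x1 = 1:3, x2 = 4:6, pred_con = 7:9)

n <- names(train_)
f <- as.formula(paste("pred_con ~", paste(n[!n %in% "pred_con"], collapse = " + ")))
print(f)   # pred_con ~ x1 + x2

# reformulate() builds the same kind of formula from a vector of
# term labels and a response name, without manual string pasting.
f_alt <- reformulate(setdiff(n, "pred_con"), response = "pred_con")
print(f_alt)
```

Either way, the point is the same: the formula is generated from the column names, so it keeps working if the set of predictor columns changes.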

The last two lines just use functions from the neuralnet package, so I won't go into detail on them.

 nn <- neuralnet(f,data=train_,hidden=c(5,3),linear.output=T)

This trains your neural network (two hidden layers, with 5 and 3 neurons) and stores it as "nn".

   pr.nn <- compute(nn,test_[,1:5])

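This predicts the values for the "test_" set using "nn" and stores the result in "pr.nn". Note that test_[,1:5] passes only the first five columns, so it assumes those are exactly the predictor columns used in the formula. The predictions themselves are in pr.nn$net.result.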
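If you are not sure the predictors really sit in columns 1 to 5, one option is to select them by name instead of by position. This is only a sketch with a made-up test_ data frame; it does not train or call the network, and the compute call is shown as a comment because it needs your own nn.

```r
# Hypothetical test data; in your case the target may be in any column.
test_ <- data.frame(x1 = c(0.1, 0.2), pred_con = c(1, 2), x2 = c(0.3, 0.4))

# Pick every column except the target, by name rather than position.
predictors <- setdiff(names(test_), "pred_con")
covariates <- test_[, predictors, drop = FALSE]
print(names(covariates))   # "x1" "x2"

# With a trained network, the prediction call would then look like:
# pr.nn <- compute(nn, covariates)
```

The columns passed to compute should match the predictors in the formula, in the same order they had in the training data.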

Upvotes: 11
