Reputation: 97
I am developing neural networks in SQL Server 2017 with R.
I use the MicrosoftML package and the NYC taxi data.
Goal: a neural network that predicts the "Ratecode" of a single taxi ride.
Here is the code:
library(MicrosoftML)
library(dplyr)
dat_all <- InputData;
sizeAll <- length(InputData$tip_amount);
sample_train <- base::sample(nrow(dat_all),
size = (sizeAll*0.9))
sample_test <- base::sample((1:nrow(dat_all))[-sample_train],
size = (sizeAll*0.1))
dat_train <- dat_all %>%
slice(sample_train)
dat_test <- dat_all %>%
slice(sample_test);
form <- Rate ~ total_amount+trip_distance+duration_in_minutes+passenger_count+PULocationID+DOLocationID;
model <- rxNeuralNet(
formula = form,
data = dat_train,
type = "multiClass",
verbose = 1);
trained_model <- data.frame(payload = as.raw(serialize(model, connection=NULL)));
Rate is correctly detected as a factor with 5 levels, representing different rates such as "Standard" or "JFK".
When I run the code, I get the following error:
Error: All rows in data has missing values (N/A). Please clean missing data before training.
Error in processing machine learning request.
Error in doTryCatch(return(expr), name, parentenv, handler) :
Error: All rows in data has missing values (N/A). Please clean missing data before training.
Error in processing machine learning request.
Calls: source ... tryCatch -> tryCatchList -> tryCatchOne -> doTryCatch -> .Call
The very same error occurs when I replace Rate with a rate ID.
I suspect that some kind of transformation is needed to get this working, but Microsoft's documentation is lacking on this point.
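Before training, it can help to check where the missing values actually are. The following is only a minimal base R sketch, assuming the error means that every row has at least one NA in a column used by the formula. The data frame `toy` and its values are made up for illustration; in the real script the checks would run on `InputData`.

```r
# Toy stand-in for InputData; column names mirror the question, values are made up.
toy <- data.frame(
  trip_distance   = c(1.2, 3.4, NA, 0.8),
  passenger_count = c(1L, 2L, 1L, NA),
  Rate            = factor(c("Standard", "JFK", NA, "Standard"))
)

# Number of missing values per column.
na_per_column <- colSums(is.na(toy))

# Number of rows without any missing value. If this is 0, a learner
# that drops incomplete rows has nothing left to train on.
n_complete <- sum(complete.cases(toy))

# Column classes, to spot columns that arrived with an unexpected type.
col_classes <- sapply(toy, class)

print(na_per_column)
print(n_complete)
print(col_classes)
```

If `n_complete` is 0 for the columns in the formula, the NA values are introduced somewhere before training, for example by a type conversion or a factor with levels that do not match the data.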
Here is the verbose output of my neural network before it fails:
***** Net definition *****
input Data [6];
STDOUT message(s) from external script:
hidden H [100] sigmoid { // Depth 1
from Data all;
}
output Result [5] softmax { // Depth 0
from H all;
}
***** End net definition *****
Input count: 6
Output count: 5
Output Function: SoftMax
Loss Function: LogLoss
PreTrainer: NoPreTrainer
___________________________________________________________________
Starting training...
Learning rate: 0,001000
Momentum: 0,000000
InitWtsDiameter: 0,100000
___________________________________________________________________
Initializing 1 Hidden Layers, 1205 Weights...
Elapsed time: 00:00:00.7222942
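As a sanity check on the log above: the reported 1205 weights are consistent with a fully connected 6-100-5 network, assuming each hidden and output unit carries one bias term. That bias assumption is mine, not something the log states. A short sketch of the arithmetic:

```r
# Layer sizes taken from the net definition in the verbose output.
n_input  <- 6
n_hidden <- 100
n_output <- 5

# Assumption: every unit has one bias weight in addition to its inputs.
weights_hidden <- (n_input + 1) * n_hidden   # 700
weights_output <- (n_hidden + 1) * n_output  # 505

total_weights <- weights_hidden + weights_output
print(total_weights)  # 1205
```

So the network itself was built with the 6 raw inputs as plain numeric features; the failure happens after initialization, when the data is read.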
Upvotes: 0
Views: 127
Reputation: 97
I figured it out. Here is the working code:
library(MicrosoftML)
library(dplyr)
netDefinition <- ("
input Data auto;
hidden Mystery [100] sigmoid from Data all;
hidden Magic [100] sigmoid from Mystery all;
output Result auto softmax from Magic all;
")
dat_all <- InputData;
LocationLevels <- as.factor(c(1:265));
dat_all$PULocationID <- factor(dat_all$PULocationID, levels=LocationLevels);
dat_all$DOLocationID <- factor(dat_all$DOLocationID, levels=LocationLevels);
dat_all$RatecodeID <- factor(dat_all$RatecodeID, levels=as.factor(c(1:6)) );
form <- RatecodeID ~ trip_distance+total_amount+duration_in_minutes+passenger_count+PULocationID+DOLocationID;
model <- rxNeuralNet(
formula = form,
data = dat_all,
netDefinition=netDefinition,
type = "multiClass",
numIterations = 100,
verbose = 1);
trained_model <- data.frame(payload = as.raw(serialize(model, connection=NULL)));
The main issue was converting the categorical columns (PULocationID, DOLocationID and RatecodeID) to factors with explicit levels.
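To illustrate what factors with explicit levels do, here is a small base R sketch. The input vectors are hypothetical and not taken from the taxi data; only the level ranges (1 to 265 and 1 to 6) come from the code above.

```r
# Hypothetical raw IDs as they might arrive from SQL (integers).
raw_ids <- c(1L, 5L, 265L, 99L)

# Explicit levels fix the encoding up front, so the number of categories
# does not depend on which IDs happen to appear in a given sample.
location_levels <- as.character(1:265)
loc <- factor(raw_ids, levels = location_levels)

print(nlevels(loc))  # 265
print(anyNA(loc))    # FALSE

# A value outside the declared levels silently becomes NA.
rate <- factor(c(1L, 2L, 99L), levels = as.character(1:6))
print(is.na(rate))   # FALSE FALSE TRUE
```

The last part is worth keeping in mind: any value that is not listed in `levels` turns into NA without a warning, so it makes sense to check for NA values again after the conversion.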
Upvotes: 0